ESTIMATING  SINGLE  AND  MULTIPLE  TARGET  LOCATIONS  USING 
K-MEANS  CLUSTERING  WITH  RADIO  TOMOGRAPHIC  IMAGING  IN 
WIRELESS  SENSOR  NETWORKS 

THESIS 

Jeffrey  K.  Nishida,  Captain,  USAF 
AFIT-ENG-MS-15-M-038


DEPARTMENT  OF  THE  AIR  FORCE 
AIR  UNIVERSITY 

AIR  FORCE  INSTITUTE  OF  TECHNOLOGY 


Wright-Patterson  Air  Force  Base,  Ohio 


DISTRIBUTION  STATEMENT  A: 

APPROVED  FOR  PUBLIC  RELEASE;  DISTRIBUTION  UNLIMITED 


The  views  expressed  in  this  thesis  are  those  of  the  author  and  do  not  reflect  the  official 
policy  or  position  of  the  United  States  Air  Force,  the  Department  of  Defense,  or  the  United 
States  Government. 


This  material  is  declared  a  work  of  the  U.S.  Government  and  is  not  subject  to  copyright 
protection  in  the  United  States. 




Presented to the Faculty

Department of Electrical and Computer Engineering
Graduate School of Engineering and Management
Air Force Institute of Technology
Air University

Air Education and Training Command
in Partial Fulfillment of the Requirements for the
Degree of Master of Science in Electrical Engineering


Jeffrey K. Nishida, B.S.E.E.
Captain, USAF

March  2015 




Committee  Membership: 

Richard  K.  Martin,  PhD 
Chair 

Captain  Jesse  D.  Peterson,  PhD 
Member 

Jason  R.  Pennington,  PhD 
Member 




Abstract 


Geolocation  involves  using  data  from  a  sensor  network  to  assess  and  estimate  the 
location  of  a  moving  or  stationary  target.  Received  Signal  Strength  (RSS),  Angle  of  Arrival 
(AoA),  and/or  Time  Difference  of  Arrival  (TDoA)  measurements  can  be  used  to  estimate 
target  location  in  sensor  networks.  Radio  Tomographic  Imaging  (RTI)  is  an  emerging 
Device-Free  Localization  (DFL)  concept  that  utilizes  the  RSS  values  of  a  Wireless  Sensor 
Network (WSN) to geolocate stationary or moving target(s). The WSN is set up around
the Area of Interest (AoI) and the target of interest, which can be a person or object. The
target inside the AoI creates a shadowing loss on each link that it obstructs.
This research focuses on position estimation of single and multiple targets inside
an RTI network. This research applies K-means clustering to localize one or more targets.
K-means clustering is an algorithm that has been used in data mining applications such
as machine learning, pattern recognition, hyper-spectral imagery, artificial
intelligence, crowd analysis, and Multiple Target Tracking (MTT).


I.  Introduction 

This  chapter  provides  background  on  the  methods  and  application  of  WSNs  and  Radio 
Tomographic  Imaging  (RTI).  The  thesis  problem  statement,  assumptions,  research 
objectives,  approach  used,  and  structure  for  this  thesis  are  contained  in  this  chapter. 

1.1  Background 

The growth and maturity of wireless communication and Micro Electro-Mechanical
Systems (MEMS) technology have laid the foundation for the use of low-power, low-cost
Radio Frequency (RF) sensors in various geolocation tasks [4]. A WSN involves multiple
Radio Frequency Integrated Circuits (RFICs) deployed around an area of interest. An
RFIC is often referred to as a radio, node, or mote; these terms are used interchangeably.
Each node in the network is capable of sending and receiving information over a wireless
communication channel. A variety of military and civilian applications of WSNs have
been explored. Although geolocation with Ultra-Wideband (UWB) radar has provided much
of the framework for WSN applications [5], [6], WSNs differ in that a larger number of
nodes can be deployed. This is feasible because such networks are mobile, have flexible
uses, and are easily implemented due to their low cost. WSNs with a large number of nodes
have uses in inventory monitoring, surveillance, classification, and localization [7].
Recent research into the application and effectiveness of WSNs for surveillance,
localization, and classification has led to interest from the military, special
forces, and the emergency response community [1], [8], [9], [10].

1.2  Radio  Tomographic  Imaging 

Geolocation involves using data from a sensor network to assess and estimate the
location of a moving or stationary target. RSS, AoA, and/or TDoA measurements can
be used to estimate target location in sensor networks. RTI is an emerging DFL concept
that uses the RSS values of a WSN to geolocate stationary or moving target(s). Every
wireless node is a 2-way communication link that can transmit and receive RSS values over
the specified communication channel [5]. The WSN is set up around the AoI and the target
of interest, which can be a person or object. The target inside the AoI creates a shadowing
loss on each link that it obstructs. This research focuses on position
estimation of single and multiple targets inside an RTI network. In the literature, the focus
has been on single targets using a Maximum A-posteriori Probability (MAP) estimator [5],
[1], [10]. This research will apply K-means clustering to localize one or more targets.
K-means clustering is a known algorithm used in data mining applications such as
machine learning, pattern recognition, hyper-spectral imagery, artificial
intelligence, crowd analysis, and MTT [11], [12], [13].

1.3  Problem  Statement 

Can  K-means  clustering  be  utilized  with  an  indoor  RTI  network  to  localize  one  or 
more  targets? 

The motivation behind using K-means clustering is to provide an alternative means
to localize target(s) inside an RTI network. Additionally, localizing multiple targets in RTI
has been a difficult task. MTT has possible law enforcement, special forces, and military
applications. For example, in a hostage situation, special forces would
want to be able to localize multiple targets inside the building [10], [11], [12]. MTT would
be useful in gaining insight into where all targets inside the AoI are located.

1.4  Approach 

This thesis will include theoretical analysis and background as the foundation of
this research. Simulation and physical experiments will be used to support
the objective of this research. RTI experiments done in real time and with obstructions
provide experimental results that can be analyzed. The data from the WSN will be
collected in the form of RSS measurements, and the information from the network will
illustrate the attenuation caused by targets inside the network. Regularization,
weighting models, image reconstruction, and localization estimation techniques will be
used to provide results to be compared with simulations.

1.5  Thesis  Structure 

The remainder of this research document is arranged into four chapters. Chapter 2
provides an in-depth literature review of the research in the field of WSNs, RTI, and
MTT. Chapter 3 describes the methodology used in the completion of this research. It
also describes how all experiments are set up and how all data will be analyzed. Chapter 4
contains all experimental results in support of this research, including the analytic results
relative to the objective of the problem statement. Chapter 5 summarizes
the research conducted in this document, states its conclusions and contributions, and
describes additional research areas that can follow on from this research.


II. Related Work


This  chapter  provides  an  introduction  and  background  to  the  theory  behind  RF-based 
localization  methods.  The  research  efforts  and  evolution  of  RF-based  localization 
methods have provided the foundation for RTI. RTI is a DFL method that uses a WSN to
geolocate the position of one or multiple targets. Geolocation involves using data from
a sensor network to assess and estimate the location of a moving or stationary target.
RSS, AoA, and/or TDoA measurements can be used to estimate target location in sensor
networks [14], [15], [16]. RTI uses the RSS information from each radio to estimate the
position  of  the  target(s)  [5]. 

2.1  Notational  Conventions 

Throughout this thesis, $(\cdot)^{-1}$ and $(\cdot)^{T}$ denote a matrix inverse and transpose, respectively.
A hat (e.g. $\hat{x}$) indicates an estimate of its argument and a bar (e.g. $\bar{x}$) represents the
ensemble or sample mean of the argument. All column vectors are indicated with bold
lower case letters, row vectors are denoted with a transpose operator, and matrices are
denoted by bold capital letters.

2.2  Radio  Tomographic  Imaging  Background 

RTI is an emerging DFL concept that uses the RSS values of a WSN to geolocate
a stationary or moving target. The WSN is set up around the AoI and the target of interest,
which can be a person or object. The target inside the AoI creates a shadowing loss on
each link that it obstructs [1].

2.2.1  Ultrawideband  Imaging. 

RTI is a derivative of RF-based radar applications from the commercial industry. From
[1], UWB-based imaging devices have been developed by various companies which use
phased array radars to estimate range and bearing. A UWB network consisting of multiple
radar transmitters and receivers can be set up around a concentrated AoI to geolocate a
target. This process can be referred to as active localization [17]. In order to estimate
range and bearing, the devices emit UWB pulses and measure the resulting echoes.
Based on the estimated change in range and bearing when a target is present,
an image of the AoI can be formed to show the estimated location of the target. The
estimated image can be mapped to a pixel scene of the AoI to show the presence or absence
of target(s) [18]. The benefits of UWB are that it is device free, can offer accurate position
estimation of a target, and is passive [17], [18], [19]. The challenges with UWB are that it
requires a large bandwidth, can be expensive, and suffers monostatic scattering losses over
larger areas [5], [6].

2.2.2  Multiple-Input-Multiple-Output  Radar. 

Multiple Input, Multiple Output (MIMO) radar is an emerging field that uses
multiple radar transmitters and receivers to geolocate objects within the area that the
radars surround. MIMO radar is often referred to as a type of multistatic radar. From [20],
MIMO is used for target detection. The waveforms from the transmitters are scattered by the
target and the receivers are able to resolve the waveforms to geolocate the target inside the
spatial area. It has also been shown that MIMO can be used to track moving targets by
computing the Doppler shift. RTI eliminates the need to measure reflections and instead
uses shadowing loss as the basis for the image reconstruction inside the AoI [21].

2.2.3  Device-Free  Localization. 

The access to and growing usage of Wireless Local Area Networks (WLAN) have
allowed for the increase of DFL systems. Active systems such as the Global Positioning
System (GPS), various RF-based systems, ultrasonic-based systems, and Infrared (IR)
based systems require a device attached to the target in some fashion in order to localize
the target. DFL does not require an emitter on the target being tracked and is thus an
unobtrusive way to estimate the position of a target. Observing changes in the RSS of
a WLAN environment is a technique that can be used to localize a target in a passive DFL
environment [22], [23].

The growth of DFL systems and advancements in WLAN communication have
provided the motivation to research DFL localization methods. Approximating a target’s
location has proved useful in applications such as unobtrusively monitoring patients in a
hospital, estimating the location of assets, granting network access based on a user’s
location, and indoor traffic monitoring [22].

2.2.4  Radio  Tomographic  Imaging  with  Received  Signal  Strength. 

With the growth of low-cost RFICs, RTI has been able to grow as an emerging
technology in the realm of DFL. RTI uses RSS measurements from an RF network that is
deployed around an area of interest. All the radios in the network are capable of receiving
from and transmitting to one another. The attenuation created by the objects or people inside
the network is used to obtain images of the network area. Because the channels are noisy,
noise models are investigated for the RTI system, and regularization methods have
been explored to estimate the image of the RTI network. Error bounds on the image can be
used to calculate the accuracy of a particular RTI network [1].

Unique Links. Since all the radios in the RTI network can transmit and receive RSS
among one another, the number of two-way unique links, $M$, can be calculated as

$$M = \frac{N(N-1)}{2}, \tag{2.1}$$

where $N$ is the number of radios in the RTI network [1]. Figure 2.1 is an illustration of all
the links of an RTI network with $N = 36$ nodes and $M = 630$ links.
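
As a quick check of the link count, the sketch below enumerates every unordered pair of nodes and compares the total against the closed form $M = N(N-1)/2$. The function names are illustrative only and do not come from the thesis.

```python
from itertools import combinations


def unique_links(num_nodes):
    """Return every unordered node pair (i, j) with i < j."""
    return list(combinations(range(num_nodes), 2))


def link_count(num_nodes):
    """Closed-form number of two-way unique links, M = N(N-1)/2."""
    return num_nodes * (num_nodes - 1) // 2


links = unique_links(36)
print(len(links), link_count(36))
```

For the 36-node network of Figure 2.1, both the enumeration and the closed form give 630 links.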


Figure 2.1: Illustration of the links created in an RTI network [1].

Received Signal Strength. RTI uses RSS to measure signal power from one radio
to another in the network. From [5], the Received Signal Strength Indicator (RSSI) from
the network is the only information needed to localize targets. The hardware can remain
simple because no other information is needed in this RSS system. In the literature, RSS
measurements are typically modeled as log-normal, that is, Gaussian distributed when
expressed in decibels. The received power $P_l$ on each link $l$ over the wireless channel is [5], [24]

$$P_l \sim \mathcal{N}\left(\bar{P}(d_l), \sigma^2\right). \tag{2.2}$$

Path Loss Model. The RTI path loss model describes the RSS loss due to shadowing
loss from objects, fading loss, static losses, and measurement noise for each link $l$ in the
network. The RSS of any given link $l$ at time $t$ can be computed as [1],
[10]

$$r_l(t) = P_t - L_l(t) - S_l(t) - F_l(t) - v_l(t), \tag{2.3}$$

where

• $P_t$: Transmitted power (decibels (dB)).

• $L_l(t)$: Static losses due to distance, antenna patterns, device inconsistencies, etc. (dB).

• $S_l(t)$: Shadowing loss due to objects attenuating the signal (dB).

• $F_l(t)$: Fading loss caused by constructive and destructive interference of narrow-band
signals in multipath environments (Non-Shadowing Loss) (dB).

• $v_l(t)$: Measurement Noise (dB).

Figure 2.2: Illustration of a single obstructed link in an RTI network [1].

Radio Tomographic Imaging Linear Model. The entire vector of RSS links can be
described in matrix form by the following linear model [1]

$$\mathbf{y} = \mathbf{W}\mathbf{x} + \mathbf{n}, \tag{2.4}$$

where $\mathbf{y}$ is the change in RSS from the baseline, which has length $M$, $\mathbf{x}$ is the image
scene, $\mathbf{n}$ is the noise, and $\mathbf{W}$ is a weight matrix of dimension $M \times P$, where $M$
corresponds to the number of links and $P$ represents
the number of pixels in the given RTI network. Each RSS measurement is expressed in
decibels (dB).

Noise. The noise $\mathbf{n}$ in the model (2.4) is typically modeled as Additive
White Gaussian Noise (AWGN) [25], [26], [27], [28]. The noise can empirically be
modeled as

$$\mathbf{n} \sim \mathcal{N}(0, \sigma_n^2), \tag{2.5}$$

where $\sigma_n^2$ is the measured variance of the link data for the particular RTI network. As
discussed in [1], the statistics of the noise vector must be examined. From [28], the main
contributors to the noise $\mathbf{n}$ are the free space path loss, loss due to shadowing, receiver
gains (which can be antenna gain and/or cabling losses), and transmitter gains. Hamilton
also assumes single-path propagation, but notes that it can be extended to multi-path
channels. In [1], a Gaussian mixture model was used to fit the measured data. The two-part
log-normal mixture model, with values in decibels, can be modeled as

$$f_{n_i}(u) = \sum_{j \in \{1,2\}} \frac{p_j}{\sqrt{2\pi\sigma_j^2}}\, e^{-u^2/(2\sigma_j^2)}, \tag{2.6}$$

where $f_{n_i}(u)$ is the probability density function of the random noise variable $n_i$, $p_j$ is the
probability for part $j$, and $\sigma_j^2$ is the variance of part $j$. This model is based on the results
from [29].
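
To make the two-part mixture concrete, the following sketch evaluates a zero-mean, two-component Gaussian mixture density and checks numerically that it integrates to one. It assumes the form $f(u) = \sum_j p_j\, \mathcal{N}(u; 0, \sigma_j^2)$. The component probabilities and variances are hypothetical placeholders, not the values fitted in [1], and the helper names are illustrative.

```python
import math


def mixture_pdf(u, probs, variances):
    """Zero-mean Gaussian mixture density: sum_j p_j * N(u; 0, sigma_j^2)."""
    total = 0.0
    for p_j, var_j in zip(probs, variances):
        norm = p_j / math.sqrt(2.0 * math.pi * var_j)
        total += norm * math.exp(-u * u / (2.0 * var_j))
    return total


def integrate(func, lo, hi, steps=20000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    acc = 0.5 * (func(lo) + func(hi))
    for k in range(1, steps):
        acc += func(lo + k * h)
    return acc * h


# Hypothetical parameters (assumption): equal-probability parts with
# variances of 1 dB^2 and 9 dB^2.
PROBS = (0.5, 0.5)
VARIANCES = (1.0, 9.0)

area = integrate(lambda u: mixture_pdf(u, PROBS, VARIANCES), -40.0, 40.0)
print(area)
```

The part probabilities must sum to one for the result to be a valid density. The wider component models the heavy tails that a single Gaussian would miss.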

2.3  Weighting  Models 

If absolute knowledge of the area of interest were available, the weights for every link
$l$ at each pixel would be known exactly. In time-critical situations where RTI would be
used, users will likely not have the luxury of surveying the scene for all obstructions and
interior arrangements, or have access to any other site-specific information. This is why
a statistical model for $\mathbf{W}$ needs to be robust across a range of environments
and network sizes. In the literature, various models have been explored to represent the
weighting matrix $\mathbf{W}$ from (2.4). Although $\mathbf{W}$ has taken on various forms in the literature,
the weighting matrix can be decomposed into two parts as shown in [30], [31]. The general
form for $\mathbf{W}$ is

$$\mathbf{W} = \mathbf{S} \odot \mathbf{\Omega}, \tag{2.7}$$

where $\mathbf{S}$ is a binary selection matrix, $\odot$ is element-wise (Hadamard) multiplication, and
$\mathbf{\Omega}$ is a real-valued matrix of weights assigned to each pixel in the network. Using singular
value decomposition (SVD), $\mathbf{W}$ can also be represented as

$$\mathbf{W} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^{T}, \tag{2.8}$$

where $\mathbf{U}$ and $\mathbf{V}$ are unitary matrices, and $\mathbf{\Sigma}$ is a diagonal matrix of singular values [32].
The three weighting models commonly found in RTI literature are the NeSh Model [1],
the Line Model [31], [33], and the NeSh-Line Model [25], [26], with the NeSh Model being
the most widely used. Other weighting models are described in [23], [27], [28],
[34], and [35], but they are used less frequently in the literature than the first three
models.


2.3.1  NeSh  Normalized  Ellipse  Model. 

The NeSh Model was first used in [1] and has since been expanded in [36]. The
expanded model has been used in [6], [37], [38], [39], [40]. The NeSh Model was designed
to take into account the shadowing loss described in (2.3) on each link $l$. The adapted
model uses a normalization factor so that the variance does not increase as the
length of a link increases. The normalization factor from [36] is

$$\Omega^{NeSh}_{l} = \frac{1}{\sqrt{d_l}}. \tag{2.9}$$

From [1], an ellipsoid with foci at the two radio locations of a link is used to determine the
weighting for each link of the network. The NeSh weighting is described mathematically as

$$w^{NeSh}_{l,p} = \frac{1}{\sqrt{d_l}} \begin{cases} 1, & \text{if } d_1(l,p) + d_2(l,p) < d_l + \lambda, \\ 0, & \text{otherwise,} \end{cases} \tag{2.10}$$

where $d_l$ is the length of link $l$, $d_1(l,p)$ is the distance from the first node of link
$l$ to the location of pixel $p$, and $d_2(l,p)$ is the distance from the second node of link $l$ to
the location of pixel $p$. $\lambda$ is a tunable parameter which represents the width of the ellipse. If
the sum of $d_1$ and $d_2$ is less than the length of the link $d_l$ plus the tunable parameter $\lambda$, a 1 is
assigned in the binary selection matrix; otherwise, a 0 is assigned. Using the general decomposed
form for $\mathbf{W}$ from (2.7), the decomposed model for $\mathbf{W}$ is

$$\mathbf{W}^{NeSh} = \mathbf{S}^{Ellipse} \odot \mathbf{\Omega}^{NeSh}. \tag{2.11}$$

This model puts weight only on pixels that fall within the ellipsoid computed for each
link. However, this model assumes that all pixels that fall within the ellipsoid have equal
weight.
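
The ellipse test can be sketched in a few lines. The code below assumes the weight is $1/\sqrt{d_l}$ for every pixel whose summed distance to the two link endpoints is less than $d_l + \lambda$, and zero otherwise. The node layout, pixel grid, and value of $\lambda$ are hypothetical, and the function names are illustrative.

```python
import math


def nesh_weight(node_a, node_b, pixel, lam):
    """Weight of one pixel for one link under the normalized ellipse model."""
    d_link = math.dist(node_a, node_b)   # link length d_l
    d1 = math.dist(node_a, pixel)        # first node to pixel center
    d2 = math.dist(node_b, pixel)        # second node to pixel center
    if d1 + d2 < d_link + lam:
        return 1.0 / math.sqrt(d_link)
    return 0.0


def nesh_matrix(nodes, pixels, lam):
    """Build W (M x P) as nested lists, one row per unique link (i < j)."""
    rows = []
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            rows.append([nesh_weight(nodes[i], nodes[j], p, lam) for p in pixels])
    return rows


# Hypothetical layout (assumption): four nodes on the corners of a 4 m square,
# sixteen 1 m pixels listed row by row by their centers.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
pixels = [(x + 0.5, y + 0.5) for y in range(4) for x in range(4)]
W = nesh_matrix(nodes, pixels, lam=0.1)
print(len(W), len(W[0]))
```

A larger $\lambda$ widens each ellipse, so more pixels receive weight for every link.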

2.3.2  Line  Model. 

The Line Model originated in [33], [41] and has been used in RTI applications
in [26], [31], [42]. The weighting factor $\mathbf{\Omega}^{Line}$ is only concerned with the portion of
link $l$ that passes through pixel $p$. The binary selection matrix entry $S^{Line}_{l,p} = 1$ if link $l$
traverses pixel $p$; otherwise $S^{Line}_{l,p} = 0$. Therefore, the weighting matrix for the Line Model
can be computed as

$$w^{Line}_{l,p} = \begin{cases} L_{l,p}, & \text{if link } l \text{ traverses pixel } p, \\ 0, & \text{otherwise,} \end{cases} \tag{2.12}$$

where $L_{l,p}$ is the length of the portion of link $l$ that traverses pixel $p$. The weight of
each entry in the matrix is assigned based on the length of the link through the pixel rather
than the inverse square root of the length of the entire link, as in the NeSh Normalized
Ellipse Model. Similar to (2.7), $\mathbf{W}$ can be decomposed as

$$\mathbf{W}^{Line} = \mathbf{S}^{Line} \odot \mathbf{\Omega}^{Line}, \tag{2.13}$$

where the Line Selection Matrix is described in more depth in [33], [42], [41]. The Line
Model is simple to implement, and in [31] the Line Model is described as being more
computationally efficient than the other commonly used weighting models in
RTI.
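
One way to obtain the length of a link inside a pixel is to clip the link segment against the pixel boundary. The sketch below uses Liang-Barsky clipping on an axis-aligned square pixel. This is an illustrative choice, since the text does not state how the traversed length is computed, and the function name and arguments are hypothetical.

```python
import math


def length_through_pixel(p0, p1, center, size):
    """Length of segment p0-p1 lying inside a square pixel of side `size`."""
    half = size / 2.0
    xmin, xmax = center[0] - half, center[0] + half
    ymin, ymax = center[1] - half, center[1] + half
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    t0, t1 = 0.0, 1.0  # parametric range of the segment still inside
    for p, q in ((-dx, p0[0] - xmin), (dx, xmax - p0[0]),
                 (-dy, p0[1] - ymin), (dy, ymax - p0[1])):
        if p == 0.0:
            if q < 0.0:
                return 0.0  # parallel to this edge and outside it
        else:
            t = q / p
            if p < 0.0:
                t0 = max(t0, t)  # entering the pixel
            else:
                t1 = min(t1, t)  # leaving the pixel
    if t0 >= t1:
        return 0.0  # the link misses the pixel
    return (t1 - t0) * math.hypot(dx, dy)


# A horizontal 4 m link crossing a 1 m pixel centered at (1.5, 0.5).
print(length_through_pixel((0.0, 0.5), (4.0, 0.5), (1.5, 0.5), 1.0))
```

A nonzero return value plays the role of the Line Model weight for that link and pixel, and a zero return value corresponds to a 0 in the selection matrix.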

2.3.3  NeSh-Line  Model. 

The NeSh-Line Model is a hybrid of the NeSh and Line models previously discussed
in this section. This model was first used in [25], [26]. The Line Selection Matrix $\mathbf{S}^{Line}$ is
the same one found in the decomposed Line Model. The weighting factor $\mathbf{\Omega}^{NeShLine}$ is computed
from the length of the portion of link $l$ that traverses pixel $p$, similar to the Line Model,
scaled by the inverse of the square root of the length of the link, similar to the NeSh
Model. $\mathbf{W}$ for the NeSh-Line Model can be decomposed as

$$\mathbf{W}^{NeShLine} = \mathbf{\Omega}^{NeShLine} \odot \mathbf{S}^{Line}, \tag{2.14}$$

where

$$w^{NeShLine}_{l,p} = \begin{cases} \dfrac{L_{l,p}}{\sqrt{d_l}}, & \text{if link } l \text{ traverses pixel } p, \\ 0, & \text{otherwise.} \end{cases} \tag{2.15}$$


2.4  Regularization  Methods 

Since the measurement vector $\mathbf{y}$ from (2.4) is the only output of the RTI
network, the image scene $\mathbf{x}$ from (2.4) needs to be estimated. Wilson and Patwari discuss
different methods to estimate $\mathbf{x}$ in [1], [32]. Since the goal is to estimate $\mathbf{x}$ and minimize the
noise in the least-squared error sense, various regularization methods have been explored.

Ill-Posed Inverse Problem. The linear model from (2.4) is common in other physical
problems [1], where the goal would be to minimize the noise in the least-squared sense,
which can be represented as

$$\hat{\mathbf{x}}_{LS} = \arg\min_{\mathbf{x}} \|\mathbf{W}\mathbf{x} - \mathbf{y}\|^2. \tag{2.16}$$

From (2.16), the least-squared solution for (2.4) would be

$$\hat{\mathbf{x}}_{LS} = (\mathbf{W}^T\mathbf{W})^{-1}\mathbf{W}^T\mathbf{y}. \tag{2.17}$$

However, $\mathbf{W}$ in most cases will not be full rank, thus estimating $\mathbf{x}$ is an ill-posed inverse
problem. Due to the transfer matrix $\mathbf{W}$ having much smaller values than the measurement
noise, regularization is useful in estimating $\mathbf{x}$. Without any type of regularization, the
measurement noise would be amplified when solving for $\mathbf{x}$ [32]. Below are a few popular
regularization methods researched for RTI.

2.4.1  Tikhonov  Regularization. 

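Tikhonov regularization is the most widely used method in RTI. It is popular
because it forces a solution by adding an energy term [1], [32]. This regularization has
the flexibility to manipulate the desired output by picking the regularization parameter $\alpha$.
The solution is obtained by a linear transformation of the measurement data [32]. The
resulting objective cost function is

$$f_{Tik}(\mathbf{x}) = \frac{1}{2}\|\mathbf{W}\mathbf{x} - \mathbf{y}\|^2 + \alpha\left(\|\mathbf{D}_x\mathbf{x}\|^2 + \|\mathbf{D}_y\mathbf{x}\|^2\right), \tag{2.18}$$

where $\mathbf{D}_x$ and $\mathbf{D}_y$ are difference operators in the $x$ and $y$ directions of $\mathbf{x}$, respectively. To find
the estimated scene $\hat{\mathbf{x}}$, the derivative of (2.18) is set equal to zero, which gives

$$\hat{\mathbf{x}}_{Tik} = \arg\min_{\mathbf{x}} \left(\frac{1}{2}\|\mathbf{W}\mathbf{x} - \mathbf{y}\|^2 + \alpha\left(\|\mathbf{D}_x\mathbf{x}\|^2 + \|\mathbf{D}_y\mathbf{x}\|^2\right)\right), \tag{2.19}$$

$$\hat{\mathbf{x}}_{Tik} = \left(\mathbf{W}^T\mathbf{W} + \alpha\left(\mathbf{D}_x^T\mathbf{D}_x + \mathbf{D}_y^T\mathbf{D}_y\right)\right)^{-1}\mathbf{W}^T\mathbf{y}. \tag{2.20}$$

In [1], the difference operators are summarized by the Tikhonov matrix $\mathbf{Q}$. Substituting $\mathbf{Q}$
into (2.20) yields

$$\mathbf{Q} = \mathbf{D}_x^T\mathbf{D}_x + \mathbf{D}_y^T\mathbf{D}_y, \tag{2.21}$$

$$\hat{\mathbf{x}}_{Tik} = \left(\mathbf{W}^T\mathbf{W} + \alpha\mathbf{Q}\right)^{-1}\mathbf{W}^T\mathbf{y}. \tag{2.22}$$

In matrix form, the linear operator on $\mathbf{y}$ is

$$\hat{\mathbf{x}}_{Tik} = \mathbf{\Pi}_{Tik}\,\mathbf{y}, \tag{2.23}$$

where

$$\mathbf{\Pi}_{Tik} = \left(\mathbf{W}^T\mathbf{W} + \alpha\left(\mathbf{D}_x^T\mathbf{D}_x + \mathbf{D}_y^T\mathbf{D}_y\right)\right)^{-1}\mathbf{W}^T. \tag{2.24}$$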
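
The Tikhonov projection can be precomputed once and then applied to each measurement vector. The NumPy sketch below assumes the form $\mathbf{\Pi}_{Tik} = (\mathbf{W}^T\mathbf{W} + \alpha(\mathbf{D}_x^T\mathbf{D}_x + \mathbf{D}_y^T\mathbf{D}_y))^{-1}\mathbf{W}^T$ with forward-difference operators on a row-major pixel grid. The random weight matrix, the grid size, the noise level, and the value of $\alpha$ are hypothetical stand-ins, and the function names are illustrative.

```python
import numpy as np


def difference_operators(rows, cols):
    """Forward differences along x (columns) and y (rows), row-major image."""
    n = rows * cols
    dx = np.zeros((n, n))
    dy = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            k = r * cols + c
            if c + 1 < cols:          # neighbor to the right
                dx[k, k] = -1.0
                dx[k, k + 1] = 1.0
            if r + 1 < rows:          # neighbor in the next row
                dy[k, k] = -1.0
                dy[k, k + cols] = 1.0
    return dx, dy


def tikhonov_projection(w, dx, dy, alpha):
    """Solve (W^T W + alpha Q) Pi = W^T for Pi, with Q = Dx^T Dx + Dy^T Dy."""
    q = dx.T @ dx + dy.T @ dy
    return np.linalg.solve(w.T @ w + alpha * q, w.T)


# Hypothetical problem (assumption): 6 links, 3 x 3 pixels, one attenuating pixel.
rng = np.random.default_rng(0)
rows, cols = 3, 3
W = rng.random((6, rows * cols))
x_true = np.zeros(rows * cols)
x_true[4] = 1.0
y = W @ x_true + 0.01 * rng.standard_normal(6)

Dx, Dy = difference_operators(rows, cols)
Pi = tikhonov_projection(W, Dx, Dy, alpha=1.0)
x_hat = Pi @ y
print(x_hat.reshape(rows, cols))
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical convenience. The result is the same linear operator applied to $\mathbf{y}$.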
2.4.2 Truncated Singular Value Decomposition.

Another popular regularization method is Truncated Singular Value Decomposition
(TSVD) [32]. The objective is to remove smaller singular values from the weighting matrix
$\mathbf{W}$. This method is similar to scalar regularization [43], where only $g < N$ singular values
from $\mathbf{\Sigma}$ are used in the reconstruction. The linear transformation matrix is given by

$$\mathbf{\Pi}_{TSVD} = \sum_{i=1}^{g<N} \frac{1}{\sigma_i}\mathbf{v}_i\mathbf{u}_i^T = \mathbf{V}_g\mathbf{\Sigma}_g^{-1}\mathbf{U}_g^T, \tag{2.25}$$

where $\mathbf{U}$, $\mathbf{V}$, and $\mathbf{\Sigma}$ are the matrices described in section 2.3. Therefore, the image estimate
using TSVD regularization is

$$\hat{\mathbf{x}}_{TSVD} = \mathbf{\Pi}_{TSVD}\,\mathbf{y}. \tag{2.26}$$

The drawback to this method is that, since the singular vectors are dependent on the node
locations, TSVD lacks the ability to incorporate the parameter $\alpha$ to force desired properties
of the image estimate. However, like Tikhonov regularization, the transformation matrix
$\mathbf{\Pi}_{TSVD}$ can be pre-calculated prior to recording data for quick and real-time applications.
The results from [32] show that Tikhonov and Total Variation (TV) do a better job in
minimizing noise present in the image estimate. This is due to the high frequency
components that are included in the reconstruction.
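
A minimal NumPy sketch of the truncation step follows. It assumes $\mathbf{W} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^T$ and builds the projection from the $g$ largest singular values as $\mathbf{V}_g\mathbf{\Sigma}_g^{-1}\mathbf{U}_g^T$, which has the $P \times M$ shape needed to map $\mathbf{y}$ to an image. The random weight matrix and the choice of $g$ are hypothetical, and the function name is illustrative.

```python
import numpy as np


def tsvd_projection(w, g):
    """Truncated pseudo-inverse of w keeping the g largest singular values."""
    # numpy returns singular values in descending order.
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return vt[:g].T @ np.diag(1.0 / s[:g]) @ u[:, :g].T


# Hypothetical weight matrix (assumption): 6 links, 9 pixels.
rng = np.random.default_rng(1)
W = rng.random((6, 9))
y = rng.standard_normal(6)

Pi_tsvd = tsvd_projection(W, g=3)
x_hat = Pi_tsvd @ y
print(x_hat.shape)
```

Keeping every nonzero singular value recovers the ordinary pseudo-inverse. Lowering $g$ discards the directions in which noise would be amplified the most.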


2.5  Node  Density 

The node density of an RTI network can greatly affect the accuracy of the image scene
$\mathbf{x}$. A dense network is likely to be more accurate than a
network with a sparse set of motes placed further apart. The more links that pass through a
particular area, the more RSS values are available to estimate the image scene [1].

Cramér-Rao Lower Bound. Wilson and Patwari derive the Cramér-Rao Lower
Bound (CRLB) for the unbiased estimator $\hat{\mathbf{x}}_{Tik}$ [1]. The CRLB is the error bound at each
pixel location $p$ of the particular RTI network. The Mean Squared Error (MSE) bound for
an RTI network is given by

$$\mathrm{Cov}(\hat{\mathbf{x}}_{Tik}) \geq \left(\gamma\mathbf{W}^T\mathbf{W} + \mathbf{C}_x^{-1}\right)^{-1}, \tag{2.27}$$


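where $\gamma$ is computed by the following integration

$$\gamma = \int_{-\infty}^{\infty} \frac{\left[\frac{\partial}{\partial u} f_n(u)\right]^2}{f_n(u)}\, du. \tag{2.28}$$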


$f_n$ is the two-component Gaussian distribution of $n$ found in (2.6). $\mathbf{C}_x$ is the spatial
covariance model used in [44]. The spatial covariance model is computed by

$$[\mathbf{C}_x]_{p,q} = \sigma_x^2\, e^{-d_{p,q}/\delta_c}, \tag{2.29}$$

where $\sigma_x^2$ is the variance at each pixel, $d_{p,q}$ is the distance from pixel $p$ to pixel $q$, and
$\delta_c$ is the correlation parameter.
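
The bound can be evaluated numerically once a weight matrix and a pixel grid are fixed. The sketch below assumes the bound has the form $(\gamma\mathbf{W}^T\mathbf{W} + \mathbf{C}_x^{-1})^{-1}$ with an exponentially decaying spatial covariance $[\mathbf{C}_x]_{p,q} = \sigma_x^2 e^{-d_{p,q}/\delta_c}$. For simplicity it further assumes purely Gaussian link noise of variance $\sigma^2$, for which the noise constant reduces to $\gamma = 1/\sigma^2$. The mixture model would require evaluating the integral numerically. All parameter values and function names are hypothetical.

```python
import numpy as np


def spatial_covariance(centers, sigma2_x, delta_c):
    """[C_x]_{p,q} = sigma_x^2 * exp(-d_{p,q} / delta_c)."""
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.linalg.norm(diff, axis=2)   # pairwise pixel distances
    return sigma2_x * np.exp(-dist / delta_c)


def mse_bound(w, c_x, gamma):
    """Per-pixel lower bound: diagonal of (gamma W^T W + C_x^{-1})^{-1}."""
    information = gamma * (w.T @ w) + np.linalg.inv(c_x)
    return np.diag(np.linalg.inv(information))


# Hypothetical setup (assumption): 3 x 3 grid of 1 m pixels, 6 links.
xs, ys = np.meshgrid(np.arange(3) + 0.5, np.arange(3) + 0.5)
centers = np.column_stack([xs.ravel(), ys.ravel()])
rng = np.random.default_rng(2)
W = rng.random((6, 9))

C_x = spatial_covariance(centers, sigma2_x=0.5, delta_c=1.0)
gamma = 1.0 / 4.0   # Gaussian noise with variance 4 dB^2 (assumption)
bound = mse_bound(W, C_x, gamma)
print(bound.reshape(3, 3))
```

Each entry of `bound` is at most the prior pixel variance $\sigma_x^2$, and it shrinks for pixels crossed by more heavily weighted links.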

Cylindrical Human Model. In [1], Wilson and Patwari use a Cylindrical Human
Model to assess the accuracy of a given RTI network. The purpose is to compare the
"true" attenuation field with the image scene being estimated. The model assumes a uniform
attenuation throughout the radius $R_h$ of a human positioned at a coordinate location $\mathbf{c}_h$.
The Cylindrical Human Model image scene $\mathbf{x}_h$ can be described as

$$[\mathbf{x}_h]_p = \begin{cases} 1, & \text{if } \|\mathbf{x}_p - \mathbf{c}_h\| < R_h, \\ 0, & \text{otherwise,} \end{cases} \tag{2.30}$$

where $\mathbf{x}_p$ is the $(x, y)$ center of pixel $p$.
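
The cylindrical model amounts to thresholding the distance from each pixel center to the human's position. The sketch below assumes unit attenuation inside the radius and zero outside it. The grid, position, and radius are hypothetical, and the function name is illustrative.

```python
import math


def cylindrical_human_image(pixel_centers, human_center, radius):
    """Return 1.0 for pixels whose center lies within `radius`, else 0.0."""
    return [1.0 if math.dist(c, human_center) < radius else 0.0
            for c in pixel_centers]


# Hypothetical scene (assumption): 4 x 4 grid of 1 m pixels, listed row by row,
# with a human of radius 0.8 m standing at (2.0, 2.0).
pixels = [(x + 0.5, y + 0.5) for y in range(4) for x in range(4)]
image = cylindrical_human_image(pixels, (2.0, 2.0), 0.8)
for row in range(4):
    print(image[4 * row: 4 * row + 4])
```

The resulting vector can serve as the reference scene against which an estimated image is compared.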

Spherical  Model.  In  [42],  Martin  et  al.  propose  a  spherical  model  to  represent  a 
spherical  obstruction.  The  obstruction  in  x  is 


.p  -  Acxp\—\\x{p)  -  Co\[ 


(2.31) 


where $A$ is attenuation (dB) per voxel of obstruction, is the defined radius of the
obstruction, $\gamma$ is derived from the noise model found in (2.28), and $\mathbf{c}_o$ is the coordinate
position of the obstruction. The link passing through the center of the obstruction would
have a link attenuation of


yi 


Xpdx  A 


(2.32) 


h  6 

2.6  Radio  Tomographic  Imaging  Methods 

Since the creation of RTI, other methods of RTI have been explored to improve
tracking and estimation of targets in RSS-based DFL networks. The various methods were
created to improve accuracy in an array of different applications. For example, different
methods  of  RTI  can  be  utilized  to  track  moving  targets  versus  stationary  targets.  The  WSN 
may  be  Line-of-Sight  (LOS)  or  may  need  to  be  set  up  with  obstructions  between  the  sensors 
and  the  target(s).  The  different  RTI  methods  have  varying  capabilities  that  can  be  applied 
to  the  applicable  target  tracking  situation. 

2.6.1  Mean-Based  Radio  Tomographie  Imaging. 

Mean-based  Radio  Tomographic  Imaging  (MRTI)  is  a  commonly  used  method  in  the 
literature  and  is  one  of  the  simplest  to  implement  in  terms  of  complexity.  This  method  is 
also  referred  to  as  shadowing-based  RTI  as  it  quantihes  the  loss  on  each  link  affected  by  the 
target  to  localize  the  target’s  location  [1].  Mean-based  or  shadowing-based  RTI  is  typically 
referred  to  as  RTI  [5],  [31],  [38],  [2],  [45]. 

Measurement Model. The shadowing loss on each link $l$ can be approximated by a sum of the attenuation that occurs at each pixel in the network. For $N$ frames, the mean link RSS at frame $n$ can be mathematically described as

\[ \bar{r}_{l,n} = \frac{1}{N} \sum_{i=0}^{N-1} r_{l,n-i}, \tag{2.33} \]

\[ \Delta r_{l,n} = \bar{r}_{l,n} - \bar{r}_{l,c}, \tag{2.34} \]

where $\bar{r}_{l,c}$ is the calibration RSS on each link. Therefore the sample mean for each link in vector notation is

\[ y_{mean} = [\Delta r_{1,n}, \ldots, \Delta r_{M,n}]^T. \tag{2.35} \]

As discussed in Section 2.4, the inverse problem given y needs to be solved to estimate x. Using the sample mean from (2.35) is useful in locating both static and moving targets. Due to the noise caused by walls or similar solid structures, this method is better suited for LOS applications than for non-LOS applications [36], [37], [44]. Additionally, since mean-based RTI only uses the change in the mean RSS, quick or sporadic movement would degrade the accuracy of the network [2].
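
The mean-based measurement can be sketched as below. The code assumes that the measurement on each link is the mean RSS over the most recent frames minus the calibration mean for that link; the function name and the RSS values are hypothetical.

```python
def mean_rss_change(frames, calibration_mean):
    """Return, per link, the mean RSS over the given frames minus the calibration mean.

    frames: list of frames, each a list of M link RSS values in dBm.
    calibration_mean: list of M mean RSS values recorded with an empty network.
    """
    n_frames = len(frames)
    changes = []
    for link in range(len(calibration_mean)):
        mean_rss = sum(frame[link] for frame in frames) / n_frames
        changes.append(mean_rss - calibration_mean[link])
    return changes


# Hypothetical two-link network: link 0 is shadowed, link 1 is unaffected.
frames = [[-50.0, -60.0], [-54.0, -60.0]]
y = mean_rss_change(frames, calibration_mean=[-50.0, -60.0])
```

With this sign convention, a shadowed link yields a negative entry; the opposite convention only flips the sign of the estimated image.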

2.6.2  Variance  Radio  Tomographic  Imaging. 

Variance-based Radio Tomographic Imaging (VRTI) uses the variance between RSS frames from the RTI network to estimate the target's location. Because only the variance is needed to estimate the image, the RTI network does not need to be calibrated before taking data measurements [6], [38], [39].
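
The per-link variance measurement can be sketched as follows. The code assumes an unbiased sample variance over a short window of frames; the window length, the function name, and the RSS values are hypothetical, and no calibration data is involved.

```python
import statistics


def link_variances(frames):
    """Return the sample variance of the RSS on each link over the given frames."""
    n_links = len(frames[0])
    return [statistics.variance([frame[link] for frame in frames])
            for link in range(n_links)]


# Hypothetical window of three frames on two links: motion near link 0 only.
window = [[-50.0, -60.0], [-54.0, -60.0], [-52.0, -60.0]]
s = link_variances(window)
```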

The VRTI system uses a vector y of RSS measurements on $M$ links in the RTI network to determine the variance between frames, where the variance is measured in dB. The RSS variance on each link can mathematically be defined as

\[ \mathrm{VAR}[R_{l,dB}] = \sum_p w_{l,p} x_p + n_l, \tag{2.36} \]

where $n_l$ is the measurement noise and modeling error, $w_{l,p}$ is the variance caused by a movement in pixel $p$, and $R_{l,dB}$ is the received signal strength. Using the linear model from (2.4), the linear system for VRTI can be expressed as

\[ s = Wx + n, \tag{2.37} \]

where $s$ is an $M \times 1$ measurement vector of the variance of each link $l$, $W$ is a chosen weight matrix as discussed in Section 2.3, and $x$ is the $V \times 1$ scene image that is estimated using a chosen method as discussed in Section 2.4. In [38], Tikhonov regularization is used to estimate the image, which is defined as

\[ \hat{x}_{Tik} = \Pi_{Tik} s, \tag{2.38} \]

where $\Pi_{Tik}$ is from (2.24).

Results. VRTI improves imaging in through-wall applications as shown in [6], [38], [39]. In [38], through-wall imaging with mean-based RTI was compared to VRTI, showing that VRTI produced a less noisy image through the walls. This capability is valuable because locating moving targets from outside the walls allows police, military, and rescue teams to get an image of targets inside a building prior to entering [10]. Kalman filters can be used to further increase the accuracy of tracking moving targets with VRTI. Since VRTI uses the variance on the links in the network for imaging rather than the change in the mean RSS, VRTI is not as viable as MRTI for locating stationary targets.

2.6.3  Kernel  Radio  Tomographic  Imaging. 

Kernel-based Radio Tomographic Imaging (KRTI) compares the short-term and long-term histograms of an RTI network to locate any targets inside the network. This method has the benefit of locating both stationary and moving targets in LOS and non-LOS environments. With this method, a training period is required to record the RSS histograms on all the links in the network. Unlike MRTI or VRTI, the objective is to quantify the change in RSS caused by a person through the use of histograms rather than the mean or variance of the link RSS [36], [38], [39]. Zhao et al. find the distance between short-term and long-term histograms in an RTI network using the Kullback-Leibler divergence [2].

Distance Between Long-Term and Short-Term Histograms. In KRTI, every link $l$ is characterized by short-term and long-term histograms of past RSS measurements. At frame $n$, the weighted average of the histogram $h$ is

\[ h^n = \sum_i w^{n,i} I_{y^i}, \tag{2.39} \]

where $I_{y^i}$ is an $N$-length indicator vector and $y^i$ is the RSS at time $i$. The exponentially weighted moving average is given by

\[ w^{n,i} = \begin{cases} \beta (1-\beta)^{n-i}, & i \le n, \\ 0, & \text{otherwise,} \end{cases} \tag{2.40} \]

where $\beta$ is the forgetting factor, $0 < \beta < 1$. A higher $\beta$ increases the importance of the most recent RSS measurements, while a lower $\beta$ would be more appropriate for long-term histograms. The kernel distance between the long-term and short-term histograms is found by

\[ D_K(p, q) = p^T K p + q^T K q - 2 p^T K q, \tag{2.41} \]

where $K$ is the kernel, $p$ is the short-term histogram, and $q$ is the long-term histogram defined in [2]. A commonly used kernel is the Epanechnikov kernel, which minimizes the integrated squared error [2] and is defined as

\[ K(y_i, y_j) = \frac{3}{4} \left( 1 - \frac{|y_i - y_j|^2}{\sigma_E^2} \right) I_{|y_i - y_j| \le \sigma_E}, \tag{2.42} \]

where $i$ and $j$ are elements of the RSS links in y, $I$ is the indicator function, and $\sigma_E$ is the Epanechnikov kernel width.
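
The histogram update and the kernel distance can be sketched together. The code assumes the standard Epanechnikov form with width $\sigma_E$, integer RSS bins, and the recursive form of the exponentially weighted average, $h^n = (1-\beta) h^{n-1} + \beta I_{y^n}$; the function names, bin range, and forgetting factors are hypothetical and are not the values used in [2].

```python
def epanechnikov_kernel(levels, width):
    """K[i][j] = 0.75 * (1 - u^2) with u = |y_i - y_j| / width, zero when u > 1."""
    kernel = []
    for y_i in levels:
        row = []
        for y_j in levels:
            u = abs(y_i - y_j) / width
            row.append(0.75 * (1.0 - u * u) if u <= 1.0 else 0.0)
        kernel.append(row)
    return kernel


def update_histogram(hist, rss, levels, beta):
    """One exponentially weighted step: h <- (1 - beta) * h + beta * indicator(rss)."""
    return [(1.0 - beta) * h + (beta if level == rss else 0.0)
            for h, level in zip(hist, levels)]


def kernel_distance(p, q, kernel):
    """D_K(p, q) = p'Kp + q'Kq - 2 p'Kq."""
    def quad(a, b):
        return sum(a[i] * kernel[i][j] * b[j]
                   for i in range(len(a)) for j in range(len(b)))
    return quad(p, p) + quad(q, q) - 2.0 * quad(p, q)


levels = list(range(-60, -39))       # hypothetical integer RSS bins in dBm
K = epanechnikov_kernel(levels, width=3.0)
long_term = [0.0] * len(levels)
short_term = [0.0] * len(levels)
for rss in [-45] * 50 + [-55] * 5:   # link undisturbed, then obstructed
    long_term = update_histogram(long_term, rss, levels, beta=0.05)
    short_term = update_histogram(short_term, rss, levels, beta=0.9)
distance = kernel_distance(short_term, long_term, K)
```

The short-term histogram follows the obstruction within a few frames while the long-term histogram still reflects the undisturbed link, so the distance on that link grows.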

Kernel Distance-Based Radio Tomographic Image Formation. Once the histogram distances are computed, $d = [d_1, \ldots, d_M]^T$ denotes the histogram differences of all $M$ links, where $d_l = D_K(p_l, q_l)$. Using the RTI linear model from (2.4), $d$ is defined as

d  =  Wx  +  n,  (2.43) 

where $n$ is the noise vector and $W$ is a chosen weight model discussed in Section 2.3 [1], [32], [38], [42], [46]. The histogram difference vector $d$ is used to form the image $x$, which has the modified least-squares solution

X  =  V^C„d,  (2.44)

where $C_x$ is the covariance matrix of $x$ and $C_n$ is the covariance matrix of each link's measurement noise [2]. Using Tikhonov regularization [21], the matrix notation of the modified least-squares formulation is

Xif  =  n^d,  (2.45) 

where 

=  (w^w  +  (2.46) 

and the variance of the measurement noise is given by $\sigma_N^2$.

Results. In [2], KRTI is used to track a moving target with the addition of a Kalman filter. The transition model of the Kalman filter includes the target's location and velocity. A variety of experiments were performed, including a bookstore environment with bookshelves as obstructions and a large living room in a residential setting. The experiments used 34 radios, with twenty locations being estimated in each experiment. The overall average error over the estimated locations $\hat{z}_i$ was calculated by

\[ e_{avg} = \frac{1}{N} \sum_{i=1}^{N} \|\hat{z}_i - z_i\|, \tag{2.47} \]

where $z_i$ is the true location and $N$ is the number of estimated locations. In the experiments performed, it was found that KRTI had a lower average location error than VRTI and Sub-VRTI [6], [37], [38]. Overall, KRTI offered over a 30% improvement over VRTI and Sub-VRTI [2]. Stationary experiments were also performed and had an average location error of less than 0.81 meters.
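
The average location error can be computed as sketched below. The code assumes the error is the mean Euclidean distance between estimated and true locations; the function name and the coordinates are hypothetical and are not data from [2].

```python
import math


def average_location_error(estimates, truths):
    """Mean Euclidean distance between paired estimated and true (x, y) locations."""
    total = 0.0
    for (ex, ey), (tx, ty) in zip(estimates, truths):
        total += math.hypot(ex - tx, ey - ty)
    return total / len(truths)


# Hypothetical example: one exact estimate and one estimate that is 5 units off.
e_avg = average_location_error([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])
```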

2.6.4  Other  Radio  Tomographic  Imaging  Methods. 

Other methods of RTI have also been explored in the literature. The other methods described in this section use a channel that is chosen by the user. In [40], Kaltiokallio et al. propose a method to select an optimal channel so that the reliability of the links is maximized. Experimental results show that channel diversity can increase the accuracy of a network. Histogram-based Radio Tomographic Imaging (HRTI) was first demonstrated in [47] and is the foundation for KRTI, which uses the distances between link histograms to estimate the image. In [48], directional antennas are used in the sensors of a network. This method is known as Direction-based Radio Tomographic Imaging (DRTI) and was proposed to improve the localization accuracy of RTI. Since the number of pairs grows quite large with a greater number of sensors, fewer sensors would need to be used in the network for this to be feasible. However, the experimental results showed that DRTI can improve the accuracy over omni-directional antennas in both LOS and non-LOS environments.

Table 2.1: Radio Tomographic Imaging features [2].

Features              RTI    VRTI   KRTI
Through wall?         Yes    Yes    Yes
Calibration?          Yes    N/A    No
Stationary Targets?   Yes    No     Yes
Real-Time?            Yes    Yes    Yes

2.6.5  Radio  Tomographic  Imaging  Features. 

Since multiple RTI methods have been explored, each one has features that could be appropriate for different situations. Table 2.1 shows various features for shadow-based RTI, VRTI, and KRTI. In settings where imaging needs to be done through a wall, VRTI or KRTI would be preferred over shadow-based RTI [2]. Additionally, VRTI and KRTI do not require calibration, but RTI does. This could be a drawback in an emergency situation where taking the time to calibrate may not be feasible. MRTI has the benefit over VRTI of estimating the location of stationary targets. When the target is stationary, there would not be enough variance on the links of the network for VRTI to accurately locate the target [6], [39].

2.7 Localization Methods


Because RTI is an ill-posed inverse problem, there is not a stable and unique solution to the least-squares formulation for x. Once a regularization method and an estimator are chosen, the estimated x needs to be analyzed to locate the target(s) of interest. Depending on how many targets are in the scene, there are two commonly used estimators in RTI applications to locate the target(s) [12], [38], [2], [49].

Maximum A Posteriori Estimation. For a single-target application, a Bayesian statistic on the estimated scene $\hat{x}$ can be applied to estimate the location of the target [49]. The MAP estimated location would be the pixel with the maximum value. The mathematical notation for this estimate is

\[ \hat{z} = \arg\max_p \hat{x}_p, \tag{2.48} \]

where $\hat{z}$ is the estimated location of the target and $\hat{x}_p$ is the pixel intensity at each pixel $p$.
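
For a single target, this estimate reduces to finding the brightest pixel. The sketch below assumes the image is stored as rows of pixel intensities on a regular grid and returns the center coordinate of the maximum pixel; the function name and the image values are hypothetical.

```python
def map_pixel_location(image, pixel_size):
    """Return the (x, y) center of the pixel with the maximum intensity."""
    best_value = None
    best_index = (0, 0)
    for iy, row in enumerate(image):
        for ix, value in enumerate(row):
            if best_value is None or value > best_value:
                best_value = value
                best_index = (ix, iy)
    ix, iy = best_index
    return ((ix + 0.5) * pixel_size, (iy + 0.5) * pixel_size)


# Hypothetical 2 x 3 image with its peak in the second row, second column.
z_hat = map_pixel_location([[0.1, 0.2, 0.0], [0.0, 0.9, 0.3]], pixel_size=2.0)
```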

K-Means Clustering. K-means clustering is a popular data mining tool for finding patterns or clusters of interest in a set of data. It is popular in machine learning applications, pattern recognition, hyper-spectral imagery, artificial intelligence, crowd analysis, and MTT [11], [12], [13]. The K-means algorithm partitions a given set of data into K clusters with the goal of minimizing the variance within each cluster. This algorithm is similar to the expectation-maximization algorithm, where the end goal is to find the optimal centers of the defined number of clusters [13]. K-means is an iterative process in which the objective is to minimize the total intra-cluster variance. This process assumes that the number of clusters K to be found in the given data set is fixed a priori. The objective function $J$ is therefore a squared error function,

\[ J = \sum_{i=1}^{K} \sum_{x_j \in S_i} \|x_j - C_i\|^2, \tag{2.49} \]

where $x_j$ is the set of data to be separated into clusters, $K$ is the number of clusters, $n$ is the number of cases, $S_i$ is the set of pixels assigned to cluster $i$, and $C_i$ is the centroid for cluster $i$ [13]. The algorithm is performed by the following steps [50]:

1. Place K points into the spatial area represented by the points that are being clustered; the points represent the initial clusters and centroids.

2. Assign each object to the group with the closest centroid using the squared error function from (2.49).

3. When all objects have been assigned, recalculate the positions of the K centroids.

4. Repeat steps 2 and 3 until the centroid positions converge.

The data set is fully partitioned once every object has been assigned to a cluster by minimizing the Euclidean distance from each point to its cluster centroid. This process can be extremely fast since, in practice, it is repeated fewer than n times [13].
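
The four steps above can be sketched as follows. This is a minimal illustration of the standard iteration with random initialization, not the implementation used in [50] or in this research; the function name, the seed, and the sample points are assumptions.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster 2-D points into k groups; returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)               # step 1: initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:                            # step 2: nearest centroid
            nearest = min(range(k),
                          key=lambda i: (p[0] - centroids[i][0]) ** 2
                          + (p[1] - centroids[i][1]) ** 2)
            clusters[nearest].append(p)
        updated = []
        for i, members in enumerate(clusters):      # step 3: recompute centroids
            if members:
                updated.append((sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)))
            else:
                updated.append(centroids[i])        # keep an empty cluster in place
        if updated == centroids:                    # step 4: stop once converged
            break
        centroids = updated
    return centroids, clusters


# Hypothetical pixel coordinates forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, clusters = kmeans(points, k=2)
```

For data this well separated the result does not depend on the initial centroids; for overlapping clusters, different seeds can converge to different local optima.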

There are drawbacks to this iterative process. In terms of performance, K-means is not guaranteed to return a global optimum. Since the heuristic algorithm described in this section starts with a random initialization, the final solution is sensitive to the initial set of clusters. If K is chosen inappropriately, the algorithm can produce poor results. Therefore, the algorithm relies heavily on picking a value of K that yields the desired results [13]. In [50], Chen and Shixiong propose an improved method for picking the initial cluster centers, in which the initial centroids are placed close to large quantities of points.

2.7.1  Multiple  Target  Tracking. 

Although single targets are mainly used in the literature [6], [31], [42], [51], MTT in RTI has started to be explored in [9], [10], [11]. Bocca et al. explore real-time tracking of multiple targets using RTI [8]. Channel diversity from [40] is used in conjunction with machine vision methods to track multiple targets. The results presented are from an open environment, a one-bedroom apartment, and a crowded office environment to demonstrate the capability to perform MTT with obstructions.

Pixel Threshold. When multiple targets are present, there are blobs of pixels in the image scene that go through a clustering process to estimate the location of the targets. Dynamic thresholding is used to reduce the number of pixels that go through the clustering process. In [8], an algorithm is used to threshold pixels prior to being clustered. In an empty network, the average maximum intensity of the formed RTI images is used as the baseline. The threshold is set to twice this baseline in order to disregard the pixels with low intensity. When targets are being tracked, the minimum intensity $I_{min}$ for targets $T = (t_1, \ldots, t_{|T|})$ is defined as

\[ I_{min} = \min_{t \in T} [x_G]_t, \tag{2.50} \]

where $x_G$ is the image scene after being filtered through a low-pass Gaussian kernel $G$. The filtered RTI image $x_G$ is calculated as

\[ x_G = x * G, \tag{2.51} \]

where $G$ is the Gaussian kernel and $*$ is the convolution operator. The Gaussian kernel is defined as

\[ G(x, y) = \frac{1}{2\pi\sigma_G^2} \exp\left( -\frac{x^2 + y^2}{2\sigma_G^2} \right), \tag{2.52} \]


where $\sigma_G$ is the standard deviation of the Gaussian kernel, which is set to 1 meter [8].

2.8 Chapter Summary

This chapter explained the various forms of geolocation, which have led to the work accomplished in DFL. The background of RTI was discussed in this chapter, as well as the various RTI methods. In addition, the signal processing models, noise models, regularization, and weighting models have provided the foundation for the different forms of RTI found in the literature. Once an image is estimated from the information received from the network, a method such as a MAP estimator or K-means needs to be applied to estimate the location of the target(s). The weighting model, measurement model, and regularization used in this research have been described in this chapter. Since MTT is the focus of this research, this chapter described the K-means algorithm and how it can be applied to MTT.

III. Methodology


This chapter describes the methodology used in this research to establish and collect data from an RTI network comprised of multiple RFIC motes. The following sections outline the hardware and tools used in all the data collections. The system model and implementation of the research are described in this chapter. Truth data from simulation and experimentation are used to obtain a baseline performance of the network. Lastly, the methods by which the data are analyzed after collection are outlined in this chapter.

For all data collections, feet (ft) are used as the unit of distance. All RSS values are assumed to be in dBm. For the two-dimensional (2-D) RTI network, the pixel size is $\Delta p \times \Delta p$. The x-y plane is used to show the 2-D pixel layout of all images. Therefore, all position and tracking coordinate estimates are denoted by an (x, y) coordinate in feet.

3.1  Equipment  and  Tools 

The  equipment  used  in  this  research  includes  the  Memsic  TelosB  mote  platform  [3] 
and  a  computer  with  Microsoft  Windows®  7  for  data  collection  and  processing.  The  tools 
that  were  used  in  this  research  are  described  and  listed  below.  Data  collection,  simulation, 
and  analysis  of  all  data  were  completed  in  MATLAB®. 

Memsic TelosB Mote Platform. The wireless radios used in this research are made by Crossbow Technology Incorporated (Inc.), based in San Jose, California. The model used in the research is the TelosB mote TPR2400. The University of California (UC) Berkeley developed the open-source radios, which are compatible with the TinyOS distribution. The TPR2400 was developed for the research community and provides users with the capability to interface with additional devices. The radios offer programming and data collection through a Universal Serial Bus (USB) interface.


Table 3.1: Select TPR2400 specifications [3].

Module
  RAM                           10K bytes
  Current Draw                  1.8 mA
RF Transceiver
  Frequency Band                2400 MHz to 2483.5 MHz
  RF Power                      -24 dBm to 0 dBm
  Outdoor Range                 70 m to 100 m
  Current Draw (Receive Mode)   23 mA
Electromechanical
  Size                          2.55 x 1.24 x 0.24 inches
  Weight                        0.8 ounces
  User Interface                USB

Cygwin.  The  motes  were  programmed  using  Cygwin  on  a  Microsoft  Windows®  7 
64-bit  machine.  Cygwin  is  a  collection  of  GNU  and  open-source  tools.  It  is  a  Unix-like 
environment  which  is  used  to  interface  with  Microsoft  Windows®  7.  Cygwin  was  originally 
developed  by  Cygnus  Solutions,  but  has  been  acquired  by  Red  Hat  [52]. 

Tiny OS. Tiny Operating System (OS) is an open-source operating system designed for low-power wireless devices. The TelosB motes were equipped with Tiny OS, which is written in NesC [53]. TinyOS includes the program titled "BaseStation" for programming the mote acting as the network base station. Any mote can act as either a wireless radio in the network or the base station, but this is specified when the mote is programmed.

Figure 3.1: TelosB Mote.


Spin. The motes surrounding the AoI, as well as the base station, were programmed with the "Spin" protocol, created by the Sensing and Processing Across Networks (SPAN) lab at the Department of Electrical and Computer Engineering at the University of Utah. Spin is an open-source TinyOS program written in NesC. This program collects RSS information from a WSN using a token passing protocol. Spin has specifically been tested with TelosB nodes. With this token passing protocol, only one radio transmits at a time through the channel; the motes transmit in the order specified by the user. For more information or to download the Spin program, refer to [54].

RTI LINK GUI. Data was collected using the RTI LINK Graphical User Interface (GUI), with the initial version created by Mr. Alex Folkerts (Southwestern Ohio Council for Higher Education (SOCHE) Intern), Mr. Tyler Heinl (SOCHE Intern), and Dr. Richard K. Martin (Associate Professor of Electrical Engineering at the Air Force Institute of Technology (AFIT)). The RTI LINK GUI is a MATLAB®-based application designed to collect and save packet data from the RTI network. The GUI receives the raw RSS data through the base station in two's complement form and converts the values to hexadecimal. The collection of each link's RSS values at each frame n is considered the vector $y = [y_1, y_2, \ldots, y_M]^T$ from Section 2.2.4. For MRTI, the GUI is capable of taking the raw y data at each frame and subtracting the calibration data to provide the mean change in RSS. The line weighting matrix $W$ from Section 2.3 and the Tikhonov matrix $\Pi_{Tik}$ from Section 2.4 can be calculated prior to recording data to save computation time. User-specified parameters such as the tunable parameter $\alpha$ and the pixel size $\Delta p$ are entered prior to calculating the weighting matrix and the $\Pi_{Tik}$ matrix. The GUI collects the raw link data at each frame in real time and uses the $\Pi_{Tik}$ matrix as a linear operator to output the estimated image x in near real time. The calibration data and final recorded data can be saved in the form of raw link RSS data to provide the flexibility to compare different user parameters such as pixel size and regularization values.


3.2  Network  Setup 

All experimental data in this research was taken from the same RTI network. The network covered a 19 ft x 16 ft area surrounded by $N = 70$ motes described in Section 3.1. All the motes were placed 1 ft apart around the perimeter of the network area. The motes were mounted on stands made from Polyvinyl Chloride (PVC), all at a height of 3.33 ft. The height of the sensors was chosen to be near the midsection of most adults. Inside the network, painter's tape was used to mark off coordinates so that the true position of targets inside the network could easily be known. An illustration of the mote topology is shown in Figure 3.2. The number of unique links can be determined by (2.1). Therefore, the number of unique links for $N = 70$ nodes is

\[ M = \frac{N^2 - N}{2} = \frac{70^2 - 70}{2} = 2{,}415 \text{ links}. \tag{3.1} \]
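
The link count in (3.1) can be checked by enumerating the node pairs directly. The sketch below assumes every pair of the 70 motes forms one unique bidirectional link; the function name is hypothetical.

```python
from itertools import combinations


def unique_links(n_nodes):
    """Number of unique node pairs, (n^2 - n) / 2."""
    return (n_nodes * n_nodes - n_nodes) // 2


# Enumerate every unordered pair of the 70 motes and compare with the formula.
links = list(combinations(range(70), 2))
link_count = unique_links(70)
```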


Figure 3.3 illustrates the M = 2415 links of the experimental RTI network. The sensors were equally spaced to keep the accuracy consistent throughout the entire network. In addition, the more nodes used, the more RSS link information is available to estimate the image x. Figures 3.4a and 3.4b are pictures of the network from two different corners.

Figure 3.2: Aerial and three-dimensional views of mote topology. (a) Aerial view of mote locations. (b) 3-D view from (0,0).

Figure 3.3: Experimental setup with M = 2415 links.

Radio Orientation. The TelosB TPR2420 motes are equipped with omni-directional antennas [55]. However, to be consistent, all motes were oriented in the same manner. The motes were positioned vertically with the USB interface facing towards the ground.

Figure 3.4: Pictures of the RTI experimental network. (a) View from (19,16). (b) View of (19,0).


Human Subjects. Human subjects were used in this research. Required human subjects training has been completed by the principal investigators per the AFIT RTI protocol. The signed Informed Consent Documents (ICDs) of all human subjects are approved by the Air Force Research Laboratory (AFRL) Institutional Review Board (IRB). The signed ICDs are available, and all human subjects voluntarily participated in the data collection. All human targets were localized in the upright position. Although the height of each target varied, the heights were not taken into consideration, as the sensors were placed at a height that would be obstructed by targets of various heights.

3.3  Assumptions 

The  following  are  the  assumptions  that  were  made  in  this  research: 

1.  N(P{di),(T^) 

2. $n \sim \mathcal{N}(0, \sigma_N^2 I_M)$

3. $y|x \sim \mathcal{N}(Wx, \sigma_N^2 I_M)$

4. $x \sim \mathcal{N}(0, C_x)$

5. Calibration for all data collection was completed for at least 30 sets of y observations and is available.

6. Measurement noise and static losses are averaged out through the use of the mean of the calibration data.

7. All radios were oriented in the vertical position with the USB interface facing down.

8. All human targets are tracked in the upright position.

9. The number of targets is known.

10. Filters for tracking or noise reduction are not used in this research.

11. Fade loss as a result of multipath is not significant enough to be incorporated in the weighting model or regularization method.

12. Obstructions in the network affect signal propagation between links as a result of shadowing loss.

13. The pixel intensity of each pixel in the estimated scene x is constant throughout the area of the pixel.

3.4  System  Models 

This research applied MRTI from Section 2.6.1, where $y_{mean} = [\Delta r_{1,n}, \Delta r_{2,n}, \ldots, \Delta r_{M,n}]^T$. Therefore, y is computed by $y = y_{mean} - y_c$. The linear system model is defined by (2.4). A weight model and an estimator need to be chosen to estimate x.

Weight Model. The Line Model from Section 2.3.2 was chosen for the weight model $W$. Although the NeSh Normalized Ellipse model from Section 2.3.1 is the more popular weighting model in the literature, the Line Model was chosen due to its lower complexity. Since the localization algorithm that is applied in this research adds complexity, reducing complexity elsewhere can be beneficial for real-time applications. This model assigns a weight dependent on the path lengths of the links passing through the obstruction.

Regularization. Tikhonov regularization from Section 2.4.1 was used to estimate x in the least-squares sense. The first-order difference operator Q is discussed in Section 2.4.1, and the tunable regularization parameter $\alpha$ was chosen after data collection. As discussed in [1], the optimal $\alpha$ depends on the network setup and pixel size. The results of the regularization are presented in Chapter 4.
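
For reference, the closed form of this estimator follows in one step. The sketch below assumes the Tikhonov objective with regularization weight $\alpha$ and difference operator $Q$, and assumes that $\Pi_{Tik}$ in (2.24) denotes the resulting matrix; it is the standard regularized least-squares result rather than a derivation reproduced from [1].

```latex
\begin{align*}
f(x) &= \|Wx - y\|^2 + \alpha \|Qx\|^2 \\
\nabla f(x) &= 2 W^T (Wx - y) + 2 \alpha Q^T Q x = 0 \\
\hat{x}_{Tik} &= \left( W^T W + \alpha Q^T Q \right)^{-1} W^T y = \Pi_{Tik}\, y
\end{align*}
```

Because $\Pi_{Tik}$ depends only on $W$, $Q$, and $\alpha$, it can be computed once before data collection and applied to each frame as a single matrix-vector product.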

3.5  Localization  Method 

This research focuses on using the K-means algorithm to cluster higher-intensity pixels in order to estimate the position of targets. For single-target applications, the most common form of localization is taking the maximum pixel value from the estimated image scene x, as discussed in Section 2.7. However, for MTT, localization is more difficult. When multiple targets are present, the pixel intensity should be higher in each of the areas in which a target is present.

K-means Clustering. Since the primary focus of this research is multiple-target localization, K-means clustering from Section 2.7 is used to estimate the location of multiple targets. When multiple targets are present in an image scene, the higher-intensity pixels can be clustered together and localized using K-means. The associated squared error cost function for K-means is

\[ J = \sum_{i=1}^{K} \sum_{x_j \in S_i} \|x_j - C_i\|^2, \tag{3.2} \]

where $C_i$ is the position of cluster $i$ and $x_j$ is the $j$th element of the pixels above a set threshold to be assigned to a cluster. The number of clusters $K$ and the pixel locations to be clustered, $x_j$, are the inputs to the K-means algorithm. Since it is assumed that the number of targets is known a priori, $K$ can be chosen to match the number of targets known to be present inside the network.

Pixel Intensity Threshold. Since the desired pixels are those with higher intensity, a threshold is needed to separate the pixels occupied by targets from the pixels with no targets present. From experimental data, it was found that the statistics of the image x change with the regularization parameter α, the pixel size Δp, the number of targets, and where in the network the targets are. Therefore, the threshold has to be robust enough to accommodate the change in intensity values between frames and flexible enough to handle changes in the chosen parameters. The threshold Tc determines which pixels are passed to the K-means algorithm. The variance is first found from the estimated image scene, which is

\sigma_n^2 = \mathrm{VAR}\{ x_{Tik,n} \},    (3.3)

where x_{Tik,n} is the Tikhonov estimate at frame n. The image scene can be modeled as Gaussian [1]. Pixels that are occupied by targets can be assumed to have much greater values than the pixels unoccupied by targets. The threshold Tc mainly used for this research is

T_c = 3\sigma_n,    (3.4)

where 3σ_n is three times the standard deviation of the image at frame n.
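The per-frame threshold of (3.3) and (3.4) can be sketched as follows. This is an illustrative Python fragment only: the list-of-rows image layout, the pixel-center convention, and the use of the population standard deviation are assumptions, and `threshold_pixels` is a hypothetical name.

```python
import statistics

def threshold_pixels(image, pixel_size, n_sigma=3.0):
    """Return (Tc, locations) for one frame.

    Tc = n_sigma * sigma_n, where sigma_n is the standard deviation of all
    pixel values in the frame. `image` is a list of rows; pixel (row, col) is
    assumed to be centered at ((col + 0.5) * pixel_size, (row + 0.5) * pixel_size).
    """
    values = [v for row in image for v in row]
    sigma_n = statistics.pstdev(values)  # population standard deviation (assumed)
    tc = n_sigma * sigma_n
    locations = [((c + 0.5) * pixel_size, (r + 0.5) * pixel_size)
                 for r, row in enumerate(image)
                 for c, v in enumerate(row) if v > tc]
    return tc, locations
```

Calling the function once per frame reflects the observation that the image statistics, and therefore Tc, change from frame to frame.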

In  summary: 

• System Model: y = Wx + n

• Measurement Model: y = [\Delta r_1, \Delta r_2, \ldots, \Delta r_M]^T

• Calibration: y_c = [\bar{r}_{c,1}, \bar{r}_{c,2}, \ldots, \bar{r}_{c,M}]^T

• Weighting Matrix: [W_{line}]_{l,p} = 1 if link l traverses voxel p, 0 otherwise

• Estimator: x_{Tik} = \arg\min_x ( \| Wx - y \|^2 + \alpha \| Qx \|^2 )

• Tikhonov Matrix: Q = … + D_X^T D_X + D_Y^T D_Y

• Pixel Threshold: T_c = 3\sigma_n, \sigma_n^2 = \mathrm{VAR}[x_{Tik,n}]

• Localization: J = \sum_{i=1}^{K} \sum_{x_j \in S_i} \| x_j - c_i \|^2

3.6  Choosing  Model  and  Experiment  Parameters 

Trade-off analysis has been conducted in [56] and [1] for model parameters. However, in this research the model parameters α and Δp were chosen based on a review of preliminary results. Data from these experiments were analyzed using a range of values for α and Δp, and the resulting Root Mean Squared Error (RMSE) for each data set was compared.

3.7  Simulated  Truth  Data 

The Cylindrical Human model described in Section 2.5 was used to simulate all the stationary truth images. C_h is set to be the known (x, y) coordinates of the targets to be localized. The model can mathematically be described as

x_{CHM}(p) = \begin{cases} 1, & \text{if } \| (x,y)_p - C_h \| < R_h \\ 0, & \text{otherwise} \end{cases}    (3.5)

where x_CHM is a [L_x, L_y] matrix set by the pixel size Δp, (x, y)_p is the center coordinate of each pixel p, and R_h is the human radius. Therefore, the true attenuation image model x_CHM contains a 1 in each pixel whose center lies within R_h of C_h and zeros elsewhere. The simulated y_sim data is calculated by the linear model (2.4) from Section 2.2.4. The vector y_sim can mathematically be shown as

y_{sim} = W_{line} x_{CHM} + n_{sim},    (3.6)

where n_sim is a simulated AWGN vector whose variance is listed in Table 3.2. Using y_sim from (3.6), the simulated image scene using Tikhonov Regularization is

x_{sim} = (W_{line}^T W_{line} + \alpha Q)^{-1} W_{line}^T y_{sim},    (3.7)

where Q and the line weighting matrix W_line are defined in Section 3.4. Substituting the Tikhonov matrix from Section 2.4.1 yields

x_{sim} = \Pi_{Tik} y_{sim},    (3.8)

where Π_Tik can be computed in advance for both simulation and real-time applications.
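The truth image of (3.5) and the noisy measurement vector of (3.6) can be sketched in a few lines of Python. The grid dimensions, the pixel-center convention, the list-of-rows layout for W, and both function names are assumptions made for this illustration.

```python
import math
import random

def chm_truth_image(targets, r_h, pixel_size, n_x, n_y):
    """Cylindrical human model image: 1 where a pixel center lies within r_h
    of any target, 0 elsewhere.

    Returned as n_y rows of n_x values. Pixel (row, col) is assumed to be
    centered at ((col + 0.5) * pixel_size, (row + 0.5) * pixel_size).
    """
    image = []
    for row in range(n_y):
        y = (row + 0.5) * pixel_size
        image.append([
            1.0 if any(math.hypot((col + 0.5) * pixel_size - cx, y - cy) < r_h
                       for cx, cy in targets) else 0.0
            for col in range(n_x)
        ])
    return image

def simulate_measurements(W, x, noise_var, rng=random):
    """y_sim = W x + n_sim, with W a list of rows (one per link), x the
    flattened image, and n_sim zero-mean Gaussian noise of variance noise_var."""
    sigma = math.sqrt(noise_var)
    return [sum(w * v for w, v in zip(row, x)) + rng.gauss(0.0, sigma)
            for row in W]
```

For example, `chm_truth_image([(4.0, 9.0)], 1.1, 0.5, 32, 32)` builds a single-target truth image on an assumed 16 ft by 16 ft grid with the human radius and one of the pixel widths from Table 3.2.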
Table 3.2: Simulated truth data parameters.

Parameter    Value         Description
             10            Noise variance (dB²)
R_h          1.1           Human radius for cylindrical model (ft)
Δp           0.25 & 0.5    Pixel width (ft)

Table 3.2 shows the parameters used in the simulated truth data. As shown in Figures 3.5-3.7, with almost no regularization, where α = 25, the noise in the network is more apparent in the estimated image scene. As discussed in Section 2.4.1, the purpose of α is to suppress the noise spikes in the image. In [1], α was varied for the given network until the MSE was minimized. In this research, α was increased until a visually acceptable image scene was found. The advantage of α is that it can be changed by the user both in real time and when analyzing the data post collection. As seen in Figures 3.5-3.7, the desired α changes with pixel size. As the pixel size grows smaller, the energy is spread over a larger number of pixels and thus the properties of the image scene change. For a pixel size of Δp = 0.5 ft, the optimal regularization parameter was found to be α = 250. However, when Δp = 0.25 ft, the optimal regularization parameter was found to be α = 150.
Figure 3.5: Truth Images: Single target at (4, 9) ft.

Figure 3.6: Truth Images: Two targets at (6, 4) and (12, 10) ft.

Figure 3.7: Truth Images: Three targets at (5, 5), (8, 12) and (15, 3) ft.
(b) Histogram of Figure 3.8a, Tc = 0.197 dB/ft.
(d) Histogram of Figure 3.8c, Tc = 0.201 dB/ft.
(e) α = 150, Δp = 0.25 ft.
(f) Histogram of Figure 3.8e, Tc = 0.164 dB/ft.

Figure 3.8: Histograms of frames with varying targets and parameters.
(a) Histogram, Tc = 4σ_n.
(c) Histogram, Tc = σ_n.
(d) Plot of pixel locations above Tc = σ_n.
(e) Histogram, Tc = 3σ_n.
(f) Plot of pixel locations above Tc = 3σ_n.

Figure 3.9: Histogram of Fig. 3.8c with varying threshold values. Pixel locations of the pixels above the threshold are plotted to the right of the histograms.
3.7.1 Cluster Threshold.

Fig. 3.8 shows three different histograms for three frames of truth images. The frames vary in target location as well as in the model parameters Δp and α. The histograms are formed with 30 bins and are fitted to a Gaussian curve. In Fig. 3.8d, the threshold using (3.4) is Tc = 0.201 dB/ft. In Fig. 3.8f, the targets are in the same location as in Fig. 3.8d, but the parameters Δp and α are different. Consequently, the statistics shown in the histogram are different, and the threshold for Fig. 3.8f changes to Tc = 0.164 dB/ft. The threshold value Tc therefore needs to be computed for each frame, since the statistics of the image scene x can change from frame to frame.

Fig. 3.9 illustrates the outcomes of changing the threshold for the same image frame. In Fig. 3.9a, when Tc = 4σ_n, the threshold is set too high, and an insufficient number of pixels are above the threshold in Fig. 3.9b. In Fig. 3.9c, when Tc = σ_n, the threshold is set too low, and more pixels than necessary are above the threshold in Fig. 3.9d. If the threshold is not stringent enough, lower pixel densities can cause the K-means clustering process to cluster together insignificant pixels. As shown in Fig. 3.9e, the essential pixel densities make it past the threshold when Tc = 3σ_n.

3.7.2  Application  of  K-means  Clustering. 

Once the pixels above the threshold are identified, the locations of those pixels are input to the K-means algorithm from (3.2). Fig. 3.10 shows, in the left column, a set of image frames with various targets. In the right column are plots of the pixel locations with densities above Tc = 3σ_n. Given the prior information K, the (x, y) locations are assigned to clusters, and the centroids represent the localization estimate of each target.

First Iteration of K-means Clustering. Figs. 3.11a, 3.12a, and 3.13a represent the first iteration of the K-means algorithm and show the centroid positions computed from all the pixels above the threshold Tc. Since all pixels above Tc are assigned to a cluster, outliers can be included: erroneous pixel density spikes can occur in the image estimate. The figures show that there are pixels that are segregated from the denser clusters around the targets. Pixels that are not close to a larger group of pixels can pull the centroid positions away from those groups. Using a cluster radius of Rc = 3.25 ft minimizes the effect of these segregated high pixel values: any pixels that lie outside the given radius from the initial cluster centroids are deleted from the centroid calculation.

Second Iteration of K-means Clustering. Figs. 3.11b, 3.12b, and 3.13b illustrate the second iteration of K-means after pixels outside of the Rc radius are deleted from the K-means algorithm. The clusters that contained a greater number of outliers had initial centroids that were further away from the denser pixel clusters. After the outliers were deleted, the centroids moved closer to the groups of pixels near the targets. In Fig. 3.11, the RMSE was 1.71 ft after the first iteration. After the pixels outside Rc = 3.25 ft were ignored and the second K-means iteration was run, the RMSE improved to 0.45 ft. Similarly, in Fig. 3.12, the RMSE improved from 0.90 ft to 0.38 ft, and in Fig. 3.13, the RMSE improved from 1.16 ft to 0.45 ft. However, it is important to note that in these cases the RMSE improved because the pixel outliers were pulling the centroid locations away from the true positions of the targets, while the denser groups of pixels were closer to the true target positions. Conversely, if the denser groups of pixels are not near the true target locations, a second iteration of K-means clustering may or may not provide an improved localization estimate.
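The two-pass procedure described above can be sketched as follows: a first K-means pass on all thresholded pixel locations, deletion of the locations farther than Rc from their nearest first-pass centroid, and a second pass on the remainder. This is a hedged Python illustration; the helper names and the choice to restart the second pass from the first-pass centroids are assumptions, not the implementation used for the reported results.

```python
import math

def _lloyd(points, centroids, max_iter=100):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        labels = [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
                  for p in points]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                updated.append((sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members)))
            else:
                updated.append(old)  # an empty cluster keeps its previous centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids

def two_pass_kmeans(points, initial_centroids, r_c=3.25):
    """Return (first_pass_centroids, second_pass_centroids).

    Pixel locations farther than r_c (ft) from their nearest first-pass
    centroid are treated as outliers and dropped before the second pass.
    """
    first = _lloyd(points, initial_centroids)
    kept = [p for p in points if min(math.dist(p, c) for c in first) <= r_c]
    second = _lloyd(kept, first) if kept else first
    return first, second
```

With four pixel locations around (5, 5) ft and one isolated location at (15, 5) ft, the first-pass centroid for K = 1 is pulled to (7, 5) ft, and the second pass returns (5, 5) ft once the isolated location is dropped.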
Figure 3.10: Image scenes and pixel threshold locations with multiple targets.

(a) First iteration of K-means clustering.
(b) Second iteration of K-means clustering.

Figure 3.11: K-means Localization: target at (9, 8) ft.

(a) First iteration of K-means clustering.
(b) Second iteration of K-means clustering.

Figure 3.12: K-means Localization: targets at (5, 5) and (9, 8) ft.

(b) Second iteration of K-means clustering.

Figure 3.13: K-means Localization: targets at (5, 5), (9, 8), and (14, 4) ft.
3.8 Experiment Design

This research focuses on a variety of experiments with the designed WSN discussed in Section 3.2. Single target stationary and tracking localization is used to compare K-means clustering with MAP localization. Since MTT is more complex and cannot use MAP localization, characterization of K-means clustering is the only focus for multiple stationary and moving targets. For all experiments, the parameters used to generate the results are defined and recorded. Network calibration was completed prior to each experiment recording and is considered the baseline for the network without any targets present. This baseline can include obstructions, which are essentially calibrated out since obstructions are not considered targets of interest in this research. For all tracking data, a metronome was used to synchronize movement with the rate at which the frames are recorded.

3.9  Data  Analysis 

All experiment data processing and analysis was accomplished using MATLAB®. Although preliminary data can be viewed in near real time through the GUI, data processing for all experiments was completed post data collection. All analysis for single and multiple targets was completed using the same process, with one exception: single target K-means localization can be compared to MAP localization, while multiple targets are only tracked using K-means clustering.

3.9.1  Experimental  Challenges. 

There are some experimental challenges that need to be overcome when utilizing the RTI motes. During data collection, the motes can give Not a Number (NaN) readings for various RSS links in the y data. Additional steps are then required to solve for the image scene of the affected frame.

Hardware Challenges. Upon investigation, it was found that the number of links with a NaN in any particular frame was less than 7 percent of the total RSS vector. Some frames did not have any NaN readings, but when NaNs were present, the NaN data had to be removed from the y vector in order to solve for the image scene x_Tik. Let L be the nominal number of links and L_NaN the number of NaN links. The new number of links in y is

L' = L - L_{NaN},    (3.9)

where the vector of RSS links, y', is now of size [L' × 1]. To accomplish this, the entries holding NaN readings are deleted from the vector y. To successfully solve for x_Tik, the rows corresponding to the NaN locations also need to be deleted from the weighting matrix W_line. The new size of the weighting matrix is [L' × P] rather than [L × P], and the modified weight matrix is denoted W'. Using Tikhonov Regularization as the chosen image estimator outlined in Section 3.4, the image estimate can mathematically be defined as

x_{Tik} = ((W')^T W' + \alpha Q)^{-1} (W')^T y',    (3.10)

where

\Pi_{Tik} = ((W')^T W' + \alpha Q)^{-1} (W')^T,    (3.11)

x_{Tik} = \Pi_{Tik} y'.    (3.12)
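A minimal Python sketch of this NaN handling is given below: the NaN entries of y and the matching rows of W are dropped, and the Tikhonov estimate is then solved on the reduced system. The dense list-of-rows matrices, the small Gaussian-elimination solver, and all function names are assumptions chosen to keep the example self-contained; they do not reproduce the MATLAB code used in this research.

```python
import math

def drop_nan_links(y, W):
    """Remove NaN entries of y and the matching rows of W (L' = L - L_NaN rows remain)."""
    keep = [i for i, v in enumerate(y) if not math.isnan(v)]
    return [y[i] for i in keep], [W[i] for i in keep]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting.

    A must be square and non-singular; a singular system raises ZeroDivisionError.
    """
    n = len(A)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def tikhonov_estimate(y, W, Q, alpha):
    """Solve ((W')^T W' + alpha Q) x = (W')^T y' after discarding NaN links."""
    y_p, W_p = drop_nan_links(y, W)
    P = len(Q)
    A = [[sum(row[i] * row[j] for row in W_p) + alpha * Q[i][j] for j in range(P)]
         for i in range(P)]
    b = [sum(row[i] * v for row, v in zip(W_p, y_p)) for i in range(P)]
    return solve(A, b)
```

Since W' depends on which links returned NaN, a projection matrix computed in advance from the full W only applies to frames with no NaN readings; this sketch simply re-solves the system for each frame.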

Negative Pixel Density. Since the vector y can contain negative values when measuring the differences in RSS links, it is possible for x to contain negative pixel density values. In [42], Martin et al. describe a method to force x to contain only positive values. The alternative is to treat negative x values as being close to 0 dB/ft. It is computationally cheap to set any negative x entries to 0. Thus, in this research, all negative x entries are set to 0.

3.9.2  Performance  Metrics. 

Estimated positions can be drawn from the x data from the RTI network using the discussed localization methods. To evaluate the accuracy of the location estimate for all the targets, an accuracy metric can be used. The RMSE of the localized estimate is commonly used in the literature [2]. The RMSE of one or more targets can mathematically be computed as

RMSE = \left( \frac{1}{T} \frac{1}{N} \sum_{t=1}^{T} \sum_{n=1}^{N} \| \hat{z}_t(n) - z_t(n) \|^2 \right)^{1/2},

where T is the number of targets, N is the number of frames, ẑ_t(n) is the estimated position for target t at frame n, and z_t(n) is the true position of target t at frame n.
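The RMSE over T targets and N frames can be computed as in the short Python sketch below. The nested-list layout and the assumption that each estimate has already been associated with the correct target are choices made for this illustration.

```python
import math

def rmse(estimates, truths):
    """RMSE over T targets and N frames.

    estimates[t][n] and truths[t][n] are (x, y) positions of target t at
    frame n. Matching each estimate to its target is assumed to be done
    before calling this function.
    """
    T = len(truths)
    N = len(truths[0])
    total = sum(math.dist(estimates[t][n], truths[t][n]) ** 2
                for t in range(T) for n in range(N))
    return math.sqrt(total / (T * N))
```

For a single stationary frame with one target, T = N = 1 and the RMSE reduces to the Euclidean distance between the estimate and the true position.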


3.10  Chapter  Summary 

This chapter described the tools and equipment used for all experiments completed in this research. The methodologies used for the network design and for the localization of targets were established. Simulated truth data, baseline data collection, and the methods used to analyze data were also described.
IV. Results and Discussion

This chapter contains the results of stationary localization for single and multiple targets as well as motion tracking in both obstructed and unobstructed environments. K-means clustering is used to localize the targets, and its performance in geolocating multiple targets is discussed. The focus is to characterize the use of K-means for single and multiple target localization. RF absorbing foam boxes are used as targets inside the designed RTI network for the experimental truth images. Stationary target localization experiments are performed with one, two, and three targets. The results of motion tracking with one and two targets are presented with and without obstructions. The results are evaluated using the performance metrics discussed in Section 3.9.2.

4.1  Experimental  Truth  Images 

A series of experimental truth images using foam boxes was used to obtain a visual performance baseline of the designed RTI network described in Section 3.2. The foam box was tall enough to be in the LOS of the sensors, with overall dimensions [L × W × H] of approximately [2.15 × 2.15 × 3.45] ft. The goal of the truth images is to gather a baseline performance of the RTI network by clustering the pixels above the threshold Tc using K-means after solving for the image scene x.

The foam boxes were moved to different parts of the network. Calibration was completed prior to placing the foam box target inside the network. As outlined in Section 3.3, calibration was completed for at least 30 frames. In Fig. 4.1 and Fig. 4.2, the lower left corner of the box was placed at (3, 10) and (2, 2) ft, respectively. All pixels with densities above the threshold Tc = 3σ_n are plotted in Fig. 4.1c and Fig. 4.2c. Most of the pixel locations are contained inside the foam box perimeter, so the cluster centroids are also contained inside the box perimeter after performing one iteration of K-means clustering. The cluster variance for Fig. 4.1 and Fig. 4.2 was 13.6 and 6.3 ft², respectively.

In Fig. 4.3, the bottom left corner of the box was placed at (8, 8) ft. The pixels above the threshold Tc = 3σ_n were spread out further than in Fig. 4.1 and Fig. 4.2, resulting in a higher variance of 24.8 ft². However, K-means clustering found the centroid to be near the center of mass of the foam box. As described in Section 2.5, the CRLB derived in [1] is lowest towards the middle of the network and higher near the corners. The experimental images taken in this research suggest the opposite trend: images with targets near the corners of the network appeared to have denser clusters than images with targets near the center of the network.
(b) Histogram of the image.

Figure 4.1: Truth image of foam box with the bottom left corner at (3, 10) ft with α = 250, Δp = 0.5 ft, and Tc = 3σ_n.

(a) Image scene.
(b) Histogram of the image.
(c) K-means clustering.
(d) Image scene after K-means clustering.

Figure 4.2: Truth image of foam box with the bottom left corner at (2, 2) ft with α = 250, Δp = 0.5 ft, and Tc = 3σ_n.

(c) K-means clustering.
(d) Image scene after K-means clustering.

Figure 4.3: Truth image of foam box with the bottom left corner at (8, 8) ft with α = 250, Δp = 0.5 ft, and Tc = 3σ_n.
4.2 Stationary Target Localization

Multiple localization experiments were done with one, two, and three human targets. K-means localization is used to geolocate the position of the target(s) inside the network. With one target, K-means clustering can be compared with MAP localization, where the pixel with the highest density is chosen as the estimated target position. In all experimental localization, one frame based on one observation of y is used to localize the target(s) inside the network. This section outlines the K-means clustering process used to localize the targets and characterizes its results. All localization is completed with 2 iterations of K-means clustering.

The objective of this section is to determine in which cases one iteration of K-means is sufficient and in which cases 2 iterations are beneficial. It also examines how noise or outliers can affect the results of K-means clustering, and the effect of changing parameters such as the pixel threshold and pixel size.

4.2.1  Single  Target  Stationary  Localization. 

In Fig. 4.4, there is a human target at (5, 5) ft. After the pixel locations above the threshold Tc = 3σ_n are kept, the first K-means iteration is performed. The error for the frame was e = 1.43 ft after the first iteration of K-means clustering and e = 0.38 ft after the second. In this situation, there were isolated pixel values over 10 ft from the target position that were above the pixel density threshold, which biased the cluster centroid. Since the second K-means iteration discarded the pixel locations outside the Rc = 3.25 ft radius, the isolated pixels did not affect the new cluster centroid location. The final estimate for the target was closer to the dense group of pixels around the true target location.

In Fig. 4.6, a comparison is made between MAP and K-means localization for a single target at (9, 8) ft. This location was chosen for the comparison because the pixel density of the image is more spread out when a target is closer to the middle of the network. In this particular scenario, the maximum pixel density is 3.54 ft away from the true position. After the K-means clustering localization, the localization error for the frame is e = 0.69 ft. In this case, K-means was more accurate because the pixel with the highest density was further away from the target position than the K-means centroid. The cluster of pixels above the threshold congregated around the target causing the attenuation shown in the image scene. As a result, the K-means clustering localization computed a centroid near the target position.
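The two single-target estimators compared above can be contrasted with a short Python sketch. It assumes a list-of-rows image with pixel centers at half-pixel offsets, and it uses the fact that K-means with K = 1 converges to the mean of all thresholded pixel locations; both function names are hypothetical.

```python
def max_pixel_estimate(image, pixel_size):
    """Center of the brightest pixel (the maximum pixel density estimator)."""
    r, c = max(((r, c) for r, row in enumerate(image) for c in range(len(row))),
               key=lambda rc: image[rc[0]][rc[1]])
    return ((c + 0.5) * pixel_size, (r + 0.5) * pixel_size)

def centroid_estimate(image, pixel_size, tc):
    """Mean location of all pixel centers above tc (K-means with K = 1).

    Raises ZeroDivisionError if no pixel exceeds tc.
    """
    locs = [((c + 0.5) * pixel_size, (r + 0.5) * pixel_size)
            for r, row in enumerate(image) for c, v in enumerate(row) if v > tc]
    return (sum(x for x, _ in locs) / len(locs),
            sum(y for _, y in locs) / len(locs))
```

A single noise spike decides the maximum-pixel estimate outright, whereas it is only one of many locations averaged into the centroid, which is consistent with the smaller K-means error reported for Fig. 4.6.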

4.2.2  Multiple  Target  Stationary  Localization. 

In Fig. 4.7, 2 targets are at (3, 11) and (12, 4) ft. The error for one frame after both K-means iterations was e = 0.78 ft. There was no change in cluster centroid positions because all the clustered pixels were inside the Rc = 3.25 ft radius after the first K-means clustering iteration. Therefore, there was no change to the RMSE after the second iteration.

In Fig. 4.9, with 3 targets at (2, 2), (5, 11), and (17, 14) ft, the error was e = 0.28 ft. Similar to Fig. 4.7, the RMSE did not change because the pixels above the threshold were inside the Rc = 3.25 ft radius after the first iteration of K-means clustering. In Fig. 4.11, for the same target locations, the initial K-means iteration picked a centroid containing pixels segregated outside the radius of the centroid near (17, 4). This caused the pixels near the (5, 11) target to be grouped with the pixel cluster around the (2, 2) target, causing the estimates for both targets to be errant. Since the initial K-means iteration found centroids more than 3.25 ft away from the true target positions for both of these targets, the second iteration did not produce a meaningfully improved result (the RMSE went from 5.17 ft to 4.75 ft, Fig. 4.12). Recall that Section 2.7 describes a drawback to K-means clustering: the process does not guarantee a global optimum. How the centroids are first chosen can have a significant impact on the outcome.

In Fig. 4.13, an experiment was performed with the same target locations as Fig. 4.11. For this localization, parameters were changed such that Δp = 0.25 ft, α = 150, and Tc = 2σ_n. The smaller pixel width results in a larger number of pixels. Additionally, the lower threshold value Tc = 2σ_n results in a larger number of pixels above the threshold. As seen in Fig. 4.14, there are more pixel locations to cluster than in Fig. 4.12. A denser cluster of pixels around the target position increases the opportunity to find a centroid among the dense crowd of pixels congregated around the respective targets. The smaller number of isolated pixels then has an insignificant effect on the centroid calculation. The localization error after the first K-means iteration was e = 0.52 ft. After the second iteration, the error was e = 0.472 ft.
(c) Histogram of the image.
(d) Pixel locations above threshold.
(e) Target localization after first iteration.
(f) Target localization after second iteration.

Figure 4.4: Localization of 1 target at (5, 5) ft with α = 250 and Δp = 0.5 ft.

Figure 4.5: K-means Localization: target at (5, 5) ft. The RMSE after the first K-means iteration was e = 1.43 ft. After the second iteration, the RMSE was e = 0.38 ft.

(c) Maximum pixel intensity localization.
(d) K-means localization.

Figure 4.6: Localization of 1 target at (9, 8) ft with α = 250, Δp = 0.5 ft, and Tc = 3σ_n. The RMSE for the maximum pixel intensity localization was e = 3.54 ft. The RMSE for the K-means clustering localization was e = 0.69 ft.
(c) Histogram of the image.
(d) Pixel locations above threshold.
(e) Target localization after first iteration.
(f) Target localization after second iteration.

Figure 4.7: Localization of 2 targets at (3, 11) and (12, 4) ft with α = 250 and Δp = 0.5 ft.

(a) First iteration of K-means clustering.
(b) Second iteration of K-means clustering.

Figure 4.8: K-means Localization: 2 targets at (3, 11) and (12, 4) ft. The RMSE after the first K-means iteration was e = 0.78 ft. After the second iteration, the RMSE was e = 0.78 ft. There was no change due to the same pixels being clustered.
(c) Histogram of the image.
(d) Pixel locations above threshold.
(e) Target localization after first iteration.
(f) Target localization after second iteration.

Figure 4.9: Localization of 3 targets at (2, 2), (5, 11), and (17, 14) ft with α = 250 and Δp = 0.5 ft.

(a) First iteration of K-means clustering.
(b) Second iteration of K-means clustering.

Figure 4.10: K-means Localization: targets at (2, 2), (5, 11), and (17, 14) ft. The RMSE after the first K-means iteration was e = 0.28 ft. After the second iteration, the RMSE was e = 0.28 ft.

(d) Pixel locations above threshold.
(e) Target localization after first iteration.
(f) Target localization after second iteration.

Figure 4.11: Localization of 3 targets at (2, 2), (5, 11), and (17, 14) ft with α = 250 and Δp = 0.5 ft.

(a) First iteration of K-means clustering.
(b) Second iteration of K-means clustering.

Figure 4.12: 3 targets at (2, 2), (5, 11), and (17, 14) ft. The RMSE after the first K-means iteration was e = 5.17 ft. After the second iteration, the RMSE was e = 4.75 ft.

(c) Histogram of the image.
(d) Pixel locations above threshold.
(e) Target localization after first iteration.
(f) Target localization after second iteration.

Figure 4.13: Localization of 3 targets at (2, 2), (5, 11), and (17, 14) ft with α = 150, Tc = 2σ_n, and Δp = 0.25 ft.

(a) First iteration of K-means clustering.
(b) Second iteration of K-means clustering.

Figure 4.14: 3 targets at (2, 2), (5, 11), and (17, 14) ft. The RMSE after the first K-means iteration was e = 0.52 ft. After the second iteration, the RMSE was e = 0.472 ft.
4.3 Motion Tracking

A series of motion tracking experiments was performed in this research. Since 3 target localization can be inaccurate, all motion tracking experiments were done with 1 and 2 targets. The problem with 3 targets is that the clustering is more susceptible to mistaking outliers for a target, as shown in Section 4.2.2. Although localization with obstructions was not the objective of this research, obstructions were used to portray real indoor scenarios. With 1 target, K-means localization can be compared to the maximum pixel density estimator. For all motion tracking, 2 iterations of K-means are performed, and the cluster centroid(s) are taken as the estimated target positions. For 2 targets, the goal was to analyze whether the targets can be successfully localized when they are standing close together, such as within the Rc = 3.25 ft radius.

4.3.1  Single  Target  Motion  Tracking. 

In Fig. 4.15, foam walls were set up, each with dimensions [L × W × H] ≈
[6 × 2 × 6] ft. The walls were arranged to simulate a hallway with a width of approximately 4
ft. The target walked through the simulated hallway, and the target's position was
estimated at each frame using both the maximum pixel density and K-means clustering.
Although the objective of the calibration is to neutralize any obstructions inside the
network, the obstructions made the image noisier. Because the image was noisier, in some
frames the maximum pixel density was farther from the true target position, while
the cluster of pixels above the threshold, Tc, was closer to the target position. This
resulted in an RMSE of e = 2.68 ft for maximum pixel density localization and an RMSE of
e = 0.91 ft for K-means localization. Fig. 4.18 illustrates a square motion tracking
path for one target. Similarly, K-means localization had a lower RMSE than maximum
pixel density localization (e = 0.77 ft versus e = 1.54 ft, Fig. 4.19) because, in some frames,
the maximum density pixel was farther from the target than the cluster of higher density
pixels around the target position.
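
The difference between the two single target estimators can be illustrated with a small
sketch. The image, its pixel values, and the function names are hypothetical, and the
threshold is assumed to be the image mean plus a multiple of the standard deviation of the
pixel values, which may differ from the exact definition of Tc used in this research. With
one cluster, K-means reduces to the centroid of the pixels above the threshold; the
cluster radius step is omitted here.

```python
import math
import statistics

def max_pixel_estimate(image, dp):
    """Position (x, y) in ft of the single pixel with the highest value."""
    row, col = max(
        ((r, c) for r in range(len(image)) for c in range(len(image[0]))),
        key=lambda rc: image[rc[0]][rc[1]],
    )
    return (col * dp, row * dp)

def threshold_centroid_estimate(image, dp, n_sigma=3.0):
    """Centroid of all pixels above the assumed threshold mean + n_sigma * std."""
    values = [v for row in image for v in row]
    t_c = statistics.mean(values) + n_sigma * statistics.pstdev(values)
    pts = [(c * dp, r * dp)
           for r, row in enumerate(image)
           for c, v in enumerate(row) if v > t_c]
    if not pts:  # nothing above threshold: fall back to the brightest pixel
        return max_pixel_estimate(image, dp)
    return (sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts))

# Hypothetical 10 ft x 10 ft image with 0.5 ft pixels and a target at (5, 5) ft.
dp = 0.5
image = [[0.0] * 20 for _ in range(20)]
for r in range(9, 12):
    for c in range(9, 12):
        image[r][c] = 0.8      # cluster of elevated pixels around the target
image[10][10] = 0.9
image[2][17] = 1.0             # errant pixel that happens to be the brightest

target = (5.0, 5.0)
err_max = math.dist(max_pixel_estimate(image, dp), target)
err_km = math.dist(threshold_centroid_estimate(image, dp), target)
print(f"max pixel error: {err_max:.2f} ft, centroid error: {err_km:.2f} ft")
```

In this constructed frame the brightest pixel is an outlier, so the maximum pixel estimate
lands far from the target, while the centroid is only pulled slightly toward the outlier.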
4.3.2 Two Target Motion Tracking.

The results for motion tracking with two targets using K-means localization are
presented in Fig. 4.20. To simulate an indoor environment with obstructions, two chairs
with an approximate footprint of [2 × 2] ft were placed centered at (4, 11) and
(14, 5) ft. The main objective of this motion path was to analyze the performance of
K-means when the targets end up less than Rc = 3.25 ft away from one another. For this
motion tracking path, K-means localization had an RMSE of e = 0.59 ft. When the targets were
within 2 ft of one another, the RMSE for that frame was less than 1 ft.
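
A minimal sketch of how a per-frame and per-path RMSE could be computed for two targets is
given below. It assumes the RMSE is the root of the mean squared Euclidean position error,
and that each estimate is paired with the true position that minimizes the total squared
error, since K-means returns centroids in no particular order. Both assumptions and all
names are illustrative and are not taken from the experimental code.

```python
import math
from itertools import permutations

def frame_rmse(true_pts, est_pts):
    """RMSE for one frame, using the estimate-to-truth pairing with least squared error."""
    best_sq = min(
        sum((tx - ex) ** 2 + (ty - ey) ** 2
            for (tx, ty), (ex, ey) in zip(true_pts, perm))
        for perm in permutations(est_pts)
    )
    return math.sqrt(best_sq / len(true_pts))

def path_rmse(true_frames, est_frames):
    """RMSE over a whole motion path (root of the mean per-frame squared error)."""
    sq = [frame_rmse(t, e) ** 2 for t, e in zip(true_frames, est_frames)]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical two-frame path: centroids arrive in swapped order in frame 1.
true_frames = [[(4.0, 4.0), (6.0, 4.0)], [(5.0, 4.0), (6.0, 4.0)]]
est_frames = [[(6.0, 5.0), (4.0, 4.0)], [(5.0, 4.0), (6.0, 4.0)]]
print(frame_rmse(true_frames[0], est_frames[0]), path_rmse(true_frames, est_frames))
```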
(a) Motion path positions.
(b) Localization estimate at each position.

Figure 4.15: Motion tracking localization of 1 target going through a simulated hallway
with walls inside the network using MAP localization. The pixel with the highest density
value was used to geolocate the target at each frame with α = 250 and Δp = 0.5 ft.
(a) Motion path positions.
(b) Localization estimate at each position.

Figure 4.16: Motion tracking localization of 1 target going through a simulated hallway
with walls inside the network. K-means localization was used to geolocate the target at
each frame with Tc = 3σ, α = 250, and Δp = 0.5 ft.
(a) X direction error at each frame.
(b) Y direction error at each frame.
(c) RMSE at each frame.

Figure 4.17: The RMSE for the target over the motion tracking path through the walls was
e = 2.68 ft using the maximum pixel density estimate. The RMSE for the target over the
motion tracking path through the walls was e = 0.91 ft using K-means localization.
(a) Max pixel density localization.
(b) K-means clustering localization.
(c) True position of target at each frame.

Figure  4.18:  A  single  target  moves  throughout  the  network  in  a  square  starting  at  (8,  8) 
and  ending  at  (5,  8)  ft.  Maximum  pixel  density  and  K-means  clustering  are  both  used  to 
localize  the  target  position  for  comparison. 
(a) X direction error at each frame.
(b) Y direction error at each frame.
(c) RMSE at each frame.

Figure 4.19: The RMSE for the target over the motion tracking path from Fig. 4.18 using
maximum pixel density localization was e = 1.54 ft. The RMSE for the target over the
motion tracking path from Fig. 4.18 using K-means clustering localization was e = 0.77 ft.
(a) Motion path positions.
(b) Localization estimates at each position.

Figure 4.20: Motion tracking localization of 2 targets with obstructions. K-means
clustering was used to geolocate the targets at each frame with Tc = 3σ, α, and
Δp = 0.5 ft.
(a) Euclidean distance error - x direction at each frame.
(b) Euclidean distance error - y direction at each frame.
(c) RMSE at each frame.

Figure 4.21: The RMSE for the two targets over the motion tracking path with obstructions
was e = 0.59 ft.
4.4 Chapter Summary

This chapter reviewed the results of stationary localization experiments with one, two,
and three targets. With one target, K-means clustering was compared to maximum pixel
density localization to show in which situations K-means could be more robust in
estimating the target location. Localization for two targets was
more consistent than localization with three targets. Motion tracking experiments were
performed in several simulated situations to analyze the performance of K-means
localization. The choice of pixel threshold and the number of pixels above the threshold
play a key role in determining which pixels of interest will be clustered together to localize
the targets inside the network.
V. Conclusion and Future Work


This chapter summarizes the methodology, results, and conclusions of this
thesis and provides recommendations for future work. Despite the growing
research interest in RTI, single target localization has been the primary focus. This research
was motivated by situations where localizing more than one target would be beneficial.
Additionally, this research sought a robust means of clustering together pixels of higher
densities as opposed to only using the pixel with the highest density.

This thesis explored a new means to localize multiple targets using a pixel threshold to
select pixels above a certain pixel density. The methods by which the image frames were
estimated were presented in Chapter 2. The weighting model, regularization,
and image estimate have been explored and are commonly used in RTI research. Practical
applications with movement, obstructions, through the wall, and outdoor environments
have been explored by the research community. The challenge has been estimating the
position of more than one target [2].

K-means clustering is a well-known algorithm used in data mining applications
such as machine learning, pattern recognition, hyperspectral imagery,
artificial intelligence, crowd analysis, and MTT [11], [12], [13]. However, the way
it is applied here is new to the RTI research community. The objective of the pixel density
threshold is to be robust enough that, as the statistics of each frame change with the number
of targets and where the targets are in the network, the threshold still segregates pixels of
higher densities. These pixels relate to where the targets are inside the network.
Additionally, with a radius to ignore pixels outside a set distance from the initial centroids,
possible errant pixel densities can be ignored. Since the computational cost of K-means
is relatively low, as discussed in Chapter 2, a second iteration is affordable and could
potentially produce more accurate localization estimates than one.
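
The thresholded pixels, cluster radius, and second iteration described above can be
sketched as follows. This is an illustrative sketch rather than the code used in the
experiments: the pixel coordinates are hypothetical, the deterministic farthest-point
initialization is an assumption made so the example is repeatable, and only the value
Rc = 3.25 ft is taken from this research.

```python
import math

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm with a farthest-point start (an assumed initialization)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:  # assign each pixel to its nearest centroid
            j = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        new = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new == centroids:  # converged
            break
        centroids = new
    return centroids

def two_pass_kmeans(points, k, r_c):
    """Cluster, drop pixels farther than r_c from every first-pass centroid, re-cluster."""
    first = kmeans(points, k)
    kept = [p for p in points if any(math.dist(p, c) <= r_c for c in first)]
    second = kmeans(kept, k) if len(kept) >= k else first
    return first, second

# Hypothetical above-threshold pixel positions (ft): two targets plus one errant pixel.
pixels = [(1.5, 2.0), (2.0, 2.0), (2.5, 2.0), (2.0, 1.5), (2.0, 2.5),
          (7.5, 8.0), (8.0, 8.0), (8.5, 8.0), (8.0, 7.5), (8.0, 8.5),
          (8.0, 3.5)]
first, second = two_pass_kmeans(pixels, k=2, r_c=3.25)
print(first)   # second centroid is pulled toward the errant pixel
print(second)  # errant pixel removed by the radius before re-clustering
```

When no pixel lies outside the radius, the second pass clusters the same pixels and the
centroids do not move.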

In a series of stationary localization experiments, this research analyzed the
performance of K-means for one, two, and three targets in an indoor network. For one target
localization, if the maximum pixel density was farther from the target position than
the cluster of pixels above the threshold, K-means localization was more accurate.
For two targets, K-means performed as accurately as single target localization. However, if
there were no pixels outside the cluster radius, a second iteration of K-means did not produce
any change in results, as the centroid locations stayed the same. Three
target localization was found to be inaccurate. Reducing the pixel width, Δp,
which increased the number of pixels in the network, and lowering the pixel threshold, Tc,
provided a larger number of pixels to be clustered, which improved the performance of
K-means. With Δp = 0.25 ft and Tc = 2σ, the three target RMSE after the second iteration
was 0.472 ft (Fig. 4.14), compared with 4.75 ft at Δp = 0.5 ft (Fig. 4.12). A larger number
of pixels was found to aid K-means clustering in finding a
solution that minimized the within-cluster variance.
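
The effect of the pixel width on the pixel count follows from the grid geometry. As a
worked illustration, assume a rectangular network area of L_x by L_y (symbols introduced
here only for this example) divided into square pixels of width Δp, ignoring rounding at
the edges:

```latex
N = \frac{L_x}{\Delta p}\cdot\frac{L_y}{\Delta p} = \frac{L_x L_y}{\Delta p^{2}},
\qquad
\frac{N(\Delta p = 0.25\ \text{ft})}{N(\Delta p = 0.5\ \text{ft})}
  = \left(\frac{0.5}{0.25}\right)^{2} = 4
```

Halving the pixel width therefore roughly quadruples the number of pixels in the image,
and with it the number of pixels that can exceed the threshold and be clustered.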

For motion tracking, the image scene estimates were found to be noisier. Thus,
in both single target tracking situations, K-means performed more accurately than using the
highest pixel density to localize the target. K-means was also able to localize
two targets that moved towards one another. When the targets were approximately 2 ft from
each other, the RMSE at that frame was less than 1 ft. Because three target localization
was less accurate and the network was limited in size, three target motion
tracking was not performed.

This research showed that K-means can be applied to one or more targets in an RTI
network. Further work is recommended to improve the process and make it more robust
for multiple targets. Section 5.1 recommends future research areas
that can expand on this research and other similar research areas of RTI.
5.1 Future Work


Automatic Target Recognition. For K-means to be successful, the number of targets
needs to be known. In this research, it is assumed the number of targets is known prior
to estimating the locations of the targets. Implementing a method to estimate the number
of targets can be beneficial in applications where the number of targets may not be known
[11], [12].

K-Medoids Clustering. K-means is not guaranteed to return a global optimum. The
final solution is sensitive to the initial set of clusters [13]. Although computationally more
expensive, using K-medoids could potentially be a viable solution for MTT with RTI.
K-medoids could potentially be more robust due to minimizing the sum of general pairwise
dissimilarities, which could reduce the negative effects of noise and outliers [57]. The
trade-off between performance and computational complexity could be examined.
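
As a rough illustration of the idea, the sketch below implements one simple K-medoids
variant (alternating assignment and medoid update, not the full PAM algorithm). The pixel
coordinates and the farthest-point initialization are assumptions for illustration only;
this was not evaluated in this research.

```python
import math

def k_medoids(points, k, n_iter=50):
    """Alternating K-medoids: each cluster center is always one of the data points."""
    medoids = [points[0]]
    while len(medoids) < k:  # assumed deterministic farthest-point start
        medoids.append(
            max(points, key=lambda p: min(math.dist(p, m) for m in medoids)))
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:  # assign each pixel to its nearest medoid
            j = min(range(k), key=lambda i: math.dist(p, medoids[i]))
            clusters[j].append(p)
        # New medoid: the member with the smallest summed distance to its cluster.
        new = [
            min(cl, key=lambda cand: sum(math.dist(cand, q) for q in cl))
            if cl else medoids[i]
            for i, cl in enumerate(clusters)
        ]
        if new == medoids:
            break
        medoids = new
    return medoids

# Hypothetical above-threshold pixel positions (ft): two targets plus one errant pixel.
pixels = [(1.5, 2.0), (2.0, 2.0), (2.5, 2.0), (2.0, 1.5), (2.0, 2.5),
          (7.5, 8.0), (8.0, 8.0), (8.5, 8.0), (8.0, 7.5), (8.0, 8.5),
          (8.0, 3.5)]
print(k_medoids(pixels, k=2))
```

Because a medoid must be an actual pixel, the errant pixel at (8, 3.5) ft does not drag the
second cluster center away from the target, at the cost of the pairwise distance sums in
every update.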

Adaptive Filter Tracking. This research did not use any adaptive filters to track
moving targets. The use of adaptive filters such as the Kalman filter or Gaussian particle
filter could reduce the RMSE. Adaptive filters could lessen the effects of
observation noise or other variables that could lead to inaccurate position estimates [10],
[11], [12], [58].
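
A minimal sketch of how a Kalman filter could smooth the per-frame K-means estimates is
shown below. It assumes a constant-velocity motion model applied independently to the x
and y coordinates, a simplified diagonal process noise, and hypothetical noise values q
and r; none of these choices were tested in this research.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one coordinate; state is (position, velocity)."""

    def __init__(self, p0, dt=1.0, q=0.01, r=1.0):
        self.x = [p0, 0.0]                    # initial position and velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]     # initial state covariance (assumed)
        self.dt, self.q, self.r = dt, q, r    # frame period, process and measurement noise

    def step(self, z):
        """Predict one frame ahead, then update with the measured position z."""
        dt, P = self.dt, self.P
        # Predict: x = F x and P = F P F^T + Q, with F = [[1, dt], [0, 1]], Q = q * I.
        p_pred = self.x[0] + dt * self.x[1]
        v_pred = self.x[1]
        p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + self.q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + self.q
        # Update with measurement matrix H = [1, 0].
        s = p00 + self.r
        k0, k1 = p00 / s, p10 / s
        innov = z - p_pred
        self.x = [p_pred + k0 * innov, v_pred + k1 * innov]
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.x[0]

# Hypothetical noisy K-means estimates (ft) of one target walking in the +x direction.
measurements = [(1.0, 5.2), (2.1, 4.9), (2.9, 5.1), (4.2, 5.0), (5.0, 4.8)]
kx = AxisKalman(measurements[0][0])
ky = AxisKalman(measurements[0][1])
track = [(kx.step(zx), ky.step(zy)) for zx, zy in measurements]
print(track)
```

Each filtered position is a weighted blend of the predicted position and the new K-means
estimate, so a single errant estimate moves the track only part of the way toward it.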